I want to remove the extra \r\n from the list of article titles I got by scraping, but I get the following error:

AttributeError: 'list' object has no attribute 'replace'


Environment
Google Chrome 86.0.4240.75 (64-bit)
JupyterLab 2.1.5
The program is written in a JupyterLab notebook.

The code in question looks like this.

#!/usr/bin/env python
# coding: utf-8
from bs4 import BeautifulSoup
import urllib.request as req
import pandas as pd
import numpy as num
import re

url = "https://www.msn.com/ja-jp"
response = req.urlopen(url)

soup = BeautifulSoup(response, 'html.parser')
lists = soup.find_all(href=re.compile("/ja-jp/news"))  # Path is displayed at the bottom right of the site
lists[1:21]

select = []
url_select = []
for list in lists:
    select.append(list.string)
    url_select.append(list.attrs['href'])

selected = select.replace('\r\n', '')  # this line raises the AttributeError
selected


Besides select.replace on the last line, I also tried select.strip() and select.re(r'(.+)\r\n'), but each one failed with the same kind of error as above.
I take the error to mean that the object I specified does not have that attribute, but even so I don't know how to fix it.
I would appreciate any advice.

  • Answer # 1

    list is the name of a built-in, so let's use a different name such as lst for the loop variable.

    When I actually ran it, lst.string was None for some elements, so I think it's fine to skip those and append the result of replace for the rest.

    for lst in lists:
        if lst.string:
            select.append(lst.string.replace('\r\n', ''))
            url_select.append(lst.attrs['href'])
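
To try the same filtering without hitting the site, here is a minimal sketch. The sample titles and URLs are made up for illustration: they only stand in for lst.string and lst.attrs['href'], with None standing in for a link that has no single text string.

```python
# Hypothetical stand-ins for lst.string and lst.attrs['href'] (not real scraped data).
raw_titles = ["\r\nHeadline A\r\n", None, "Headline B\r\n"]
raw_urls = ["/ja-jp/news/a", "/ja-jp/news/x", "/ja-jp/news/b"]

select = []
url_select = []
for title, href in zip(raw_titles, raw_urls):
    if title:  # skips None (and empty strings)
        # str.replace is called on each string, not on the list
        select.append(title.replace("\r\n", ""))
        url_select.append(href)

print(select)
print(url_select)
```

Because the title and the URL are appended inside the same if block, the two lists stay aligned after the None entries are dropped.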

  • Answer # 2

    select is a list, not a string, so you cannot call replace on it.

    Call replace on the strings that are the elements of the list instead.

    In other words, I think you should call replace on list.string (which is a string) before passing it to select.append.
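
If the list has already been built, the same idea can be applied afterwards with a list comprehension. This is only a sketch with made-up sample strings, and it assumes every element is a string (no None values left in the list).

```python
# Hypothetical sample data standing in for the scraped titles.
select = ["\r\nHeadline A\r\n", "Headline B\r\n"]

# Calling replace on the list itself reproduces the error from the question.
try:
    select.replace("\r\n", "")
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# Calling replace on each element works, because each element is a str.
selected = [s.replace("\r\n", "") for s in select]
print(selected)
```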