Some HTML files result in UnicodeDecodeError when read by BeautifulSoup #60
Comments
But did I give you all of the accuweather data, or just a sample?
Yes, there are other files with the same problem. Can you handle this in your code?
I think you gave me all the accuweather data last week. I will try to handle it in my script later today.
Are the files that trigger the error new files from last week?
No, for instance accuweather_02-06-2016_17:07_bielefeld_daily_d5_1464880053.html
Why should I handle this in my script? Don't you catch errors anyway, so that when an error occurs the file just isn't saved to the database? There is nothing more I can do anyway. And since the scraped data still contains some files that will raise AssertionErrors (from my checks), e.g. the files from April that are for French cities instead of German ones (we didn't delete those), we will have to catch assertion errors anyway. I will write a log file where all caught errors are logged, so we can see what happens.
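The "catch, skip, and log" approach described in this comment could look roughly like the sketch below. The names `process_files`, `process`, and the default log file name `scrape_errors.log` are made up for illustration and are not taken from the project; `process` stands in for whatever function parses one file and writes it to the database.

```python
import logging


def process_files(paths, process, log_path="scrape_errors.log"):
    """Run process(path) on every path; log and skip files that fail.

    `process` is a hypothetical callable standing in for the real
    parse-and-store step. Returns the list of paths that were skipped.
    """
    logger = logging.getLogger("scrape_errors")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path, encoding="utf-8")
    logger.addHandler(handler)
    failed = []
    try:
        for path in paths:
            try:
                process(path)
            except (UnicodeDecodeError, AssertionError) as exc:
                # Record which file failed and why, then move on.
                logger.error("%s: %s: %s", path, type(exc).__name__, exc)
                failed.append(path)
    finally:
        # Detach the handler so repeated calls do not duplicate log lines.
        logger.removeHandler(handler)
        handler.close()
    return failed
```

Only the two error types mentioned in this thread are caught, so any other exception still stops the run instead of being silently skipped.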
This should be handled in your script, as shown in #78.
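One possible way to handle it in the script is sketched below; it is not necessarily what #78 does, and the actual cause of the error in the affected files is not established in this thread. The sketch assumes the error comes from opening the file in text mode with the wrong encoding. The function name `read_html_text` and the fallback encoding `cp1252` are assumptions chosen for illustration.

```python
def read_html_text(path, encodings=("utf-8", "cp1252")):
    """Read an HTML file as text without raising UnicodeDecodeError.

    Tries each encoding in turn; if none decodes cleanly, falls back to
    UTF-8 with undecodable bytes replaced by U+FFFD.
    """
    with open(path, "rb") as fh:
        raw = fh.read()
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    return raw.decode("utf-8", errors="replace")
```

The returned string can then be passed to `BeautifulSoup(text, "html.parser")`. Alternatively, BeautifulSoup also accepts the raw bytes directly and tries to detect the encoding itself.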
I couldn't find a solution, but the error seems to appear for only 3 files (out of all files downloaded so far), all of them from the same download time. So unless somebody has already encountered and fixed this problem, just remove the following files from the server: