Run accuweather scraping #78

Open
denisalevi opened this issue Jul 11, 2016 · 2 comments

@denisalevi

@erensezener, I couldn't pull from the server because you have merge conflicts there. I don't want to mess around with your changes, so please pull the new version and then run my scraper like this:

import itertools
import traceback

assertion_count = 0
unicode_count = 0
others_count = 0
file_count = 0
for day, month, city in itertools.product(days, months, cities):
    date_string = '{}-{}-{}'.format(day, month, year)

    try:        
        sc_ac(date_string, city, DATAPATH)
    except Exception as ex:
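        # Check the exception's type; `ex == AssertionError` compares an
        # instance with a class and is always False.
        if isinstance(ex, AssertionError): assertion_count += 1
        elif isinstance(ex, UnicodeDecodeError): unicode_count += 1
        else: others_count += 1
        print('Excepted {} in accuweather'.format(type(ex).__name__))
        # print_exc() prints the traceback itself and returns None.
        traceback.print_exc()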
        with open("accuweather_excepted_errors.txt", "a") as myfile:
            myfile.write('Excepted {} for date={}, city={}:\n{}\n\n'.format(type(ex).__name__, date_string, city, ex))
    file_count += 1

tot = assertion_count + unicode_count + others_count
# Build the summary once so the console and the log file get the same text.
summary = '\n\nFinished saving ACCUWEATHER data to database\n Excepted errors: {}/{}\n\tAssertionErrors: {}\n\t UnicodeDecodeErrors: {}\n\t Other errors: {}\ndetails saved in accuweather_excepted_errors.txt'.format(tot, file_count, assertion_count, unicode_count, others_count)
print(summary)

with open("accuweather_excepted_errors.txt", "a") as myfile:
    myfile.write(summary)

This catches every exception raised by the scraper and appends the details to accuweather_excepted_errors.txt. Could you please post that file here after running the scraper? Thank you.
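
For reference, here is a minimal, self-contained sketch of the same catch-count-and-log pattern, in case it helps to try it without the real scraper. The names `run_and_log` and `fake_scrape`, the sample dates and cities, and the temporary log path are all hypothetical stand-ins, not part of the project. It counts exceptions by type name with `collections.Counter` instead of one counter variable per type.

```python
import collections
import itertools
import os
import tempfile
import traceback


def run_and_log(scrape, jobs, log_path):
    """Call scrape(*job) for every job; count and log any exception raised."""
    counts = collections.Counter()
    total = 0
    with open(log_path, "a") as log:
        for job in jobs:
            total += 1
            try:
                scrape(*job)
            except Exception as ex:
                name = type(ex).__name__
                counts[name] += 1
                # format_exc() returns the traceback as a string for the log.
                log.write("Excepted {} for job={}:\n{}\n\n".format(
                    name, job, traceback.format_exc()))
    return total, counts


def fake_scrape(date_string, city):
    # Hypothetical stand-in for the real scraper; fails on purpose.
    if city == "Ankara":
        raise AssertionError("no table found")
    if date_string.startswith("2-"):
        raise ValueError("bad date")


log_path = os.path.join(tempfile.mkdtemp(), "excepted_errors.txt")
jobs = itertools.product(["1-1-2016", "2-1-2016"], ["Berlin", "Ankara"])
total, counts = run_and_log(fake_scrape, jobs, log_path)

print("Excepted errors: {}/{}".format(sum(counts.values()), total))
for name, n in counts.most_common():
    print("\t{}: {}".format(name, n))
```

Keying the counter on the type name means a new kind of error shows up in the summary under its own name rather than being lumped into "other".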

@erensezener

OK, I am running it now. It will probably take at least a couple of hours.

@erensezener

erensezener commented Jul 13, 2016

It took about 5 hours. Here are the outputs: https://drive.google.com/file/d/0BwQc_CC3arWWOE5YZVpZT1pQQTQ/view?usp=sharing

If it looks OK, please close the issue.
