We have a big list of domains that we need to check and assign an activity score. Our plan is to fetch each domain's sitemap file ([login to view URL]) and parse it for last-modified dates, then do some basic math on those dates to come up with a score.
Assuming that pages are edited in groups of dates, I think we can bucket the dates by month and dump a count per month into Excel, so we have something like:
1/2019 - 10
2/2019 - 37
3/2019 - 0
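The monthly counts above could be produced along these lines. This is only a sketch: it assumes the sitemap's lastmod values start with a full YYYY-MM-DD date, and the function names `month_key` and `count_by_month` are made up for illustration.

```python
from collections import Counter
from datetime import datetime

def month_key(lastmod: str) -> str:
    """Turn a sitemap lastmod value into an 'M/YYYY' bucket label."""
    # Assumption: the value starts with YYYY-MM-DD. Year-only or
    # year-month values would need extra handling.
    d = datetime.strptime(lastmod[:10], "%Y-%m-%d")
    return f"{d.month}/{d.year}"

def count_by_month(lastmods):
    """Count how many pages were last modified in each month."""
    return Counter(month_key(value) for value in lastmods)

# Made-up sample values, with and without a time component.
sample = ["2019-01-03", "2019-01-20T10:00:00+00:00", "2019-02-11"]
counts = count_by_month(sample)
```

A `Counter` returns 0 for months it has never seen, which matches the "3/2019 - 0" case without special handling.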
So the Excel output has these columns:
domain | HTTP response | found sitemap | page count from sitemap | one column per month for the last 24 months
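One possible way to lay out that row is sketched below. It assumes the per-month counts are already available as a dict keyed by 'M/YYYY' labels; `last_months` and `build_row` are hypothetical helper names, and the sample values are invented.

```python
from datetime import date

def last_months(today: date, n: int = 24):
    """Return n 'M/YYYY' labels ending with today's month, oldest first."""
    labels = []
    year, month = today.year, today.month
    for _ in range(n):
        labels.append(f"{month}/{year}")
        month -= 1
        if month == 0:  # step back from January to December of the prior year
            month, year = 12, year - 1
    return labels[::-1]

def build_row(domain, status, found_sitemap, page_count, counts, months):
    """One output row: four fixed columns, then one count per month."""
    return [domain, status, found_sitemap, page_count] + [counts.get(m, 0) for m in months]

months = last_months(date(2019, 3, 15))
header = ["domain", "http response", "found sitemap", "page count from sitemap"] + months
row = build_row("example.com", 200, True, 47, {"1/2019": 10, "2/2019": 37}, months)
```

Rows in this shape can be written with Python's `csv` module, and Excel opens the resulting file directly.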
One thing to watch for is that sitemaps are sometimes nested: if an entry links to another sitemap for that section, you will need to follow it. For example, Google's [login to view URL]: the main map links to sub maps.
If the domain redirects or fails, we want to log that. I think we can log the HTTP response codes for this, so something like:
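Following nested sitemaps might look roughly like this. The sketch assumes the standard sitemaps.org XML namespace and takes a `fetch` callable (hypothetical: URL in, XML text or None out) so that the parsing can be tried without network access. The `FAKE_SITE` data is invented.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def collect_lastmods(url, fetch, seen=None, max_sitemaps=50):
    """Return lastmod strings for pages under url, following sitemap indexes."""
    seen = set() if seen is None else seen
    if url in seen or len(seen) >= max_sitemaps:  # guard against loops and huge trees
        return []
    seen.add(url)
    text = fetch(url)
    if not text:
        return []
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return []
    lastmods = []
    if root.tag.endswith("sitemapindex"):
        # An index file: each entry points at another sitemap to follow.
        for loc in root.findall("sm:sitemap/sm:loc", NS):
            if loc.text:
                lastmods += collect_lastmods(loc.text.strip(), fetch, seen, max_sitemaps)
    else:
        # A plain sitemap: collect the lastmod of every page that has one.
        for lm in root.findall("sm:url/sm:lastmod", NS):
            if lm.text:
                lastmods.append(lm.text.strip())
    return lastmods

XMLNS = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'
FAKE_SITE = {
    "https://example.com/sitemap.xml": f"<sitemapindex {XMLNS}>"
        "<sitemap><loc>https://example.com/a.xml</loc></sitemap>"
        "<sitemap><loc>https://example.com/b.xml</loc></sitemap></sitemapindex>",
    "https://example.com/a.xml": f"<urlset {XMLNS}><url><loc>https://example.com/1</loc>"
        "<lastmod>2019-01-03</lastmod></url></urlset>",
    "https://example.com/b.xml": f"<urlset {XMLNS}><url><loc>https://example.com/2</loc>"
        "<lastmod>2019-02-11</lastmod></url></urlset>",
}
found = collect_lastmods("https://example.com/sitemap.xml", FAKE_SITE.get)
```

Pages without a lastmod entry are skipped here, so the page count for the output would need to be taken from the number of url entries instead.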
2xx - fine
4xx - failed
5xx - failed
3xx - redirect, then log the redirect location it sends back
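The status handling above could be sketched as follows. `classify` and `probe` are hypothetical names, and `probe` assumes the domain answers over HTTPS. It uses the standard library's `http.client`, which does not follow redirects on its own, so the 3xx code and its Location header stay visible.

```python
import http.client

def classify(status, location=None):
    """Map an HTTP status code to the labels used in the list above."""
    if status is None:
        return ("failed", "")          # no response at all
    if 200 <= status < 300:
        return ("fine", "")
    if 300 <= status < 400:
        return ("redirect", location or "")
    return ("failed", "")              # 4xx, 5xx and anything unexpected

def probe(domain, timeout=60):
    """Request the home page once, without following redirects."""
    conn = http.client.HTTPSConnection(domain, timeout=timeout)
    try:
        conn.request("GET", "/", headers={"User-Agent": "activity-checker"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    except (OSError, http.client.HTTPException):
        return None, None              # DNS failure, timeout, TLS error, ...
    finally:
        conn.close()
```

A call such as `classify(*probe("example.com"))` would then give the label and redirect target to log for one domain.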
The app should run on Windows. It would be good to have a config file holding the path to the CSV of domains and a thread count, so we can tune the performance. We will be running this on big lists of domains, so it should be able to handle something like 10K domains in a controlled, threaded fashion. We should also set a timeout of a minute, or some other reasonable value, so the app doesn't freeze on a dead or bad URL.
I found this lib, which might help if you need it: [login to view URL]
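A minimal sketch of the config file and the controlled threading, using only the standard library. The INI section and key names (`checker`, `domains_csv`, `threads`, `timeout_seconds`) are assumptions for illustration, and `check` stands in for whatever per-domain function the tool ends up using.

```python
import configparser
from concurrent.futures import ThreadPoolExecutor

EXAMPLE_CONFIG = """
[checker]
domains_csv = C:/lists/domains.csv
threads = 4
timeout_seconds = 60
"""

def parse_settings(text):
    """Read path, thread count and timeout, falling back to defaults."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    return (
        cfg.get("checker", "domains_csv", fallback="domains.csv"),
        cfg.getint("checker", "threads", fallback=8),
        cfg.getfloat("checker", "timeout_seconds", fallback=60.0),
    )

def run_all(domains, check, threads, timeout):
    """Run check(domain, timeout) on a fixed-size pool, keeping input order."""
    def safe(domain):
        try:
            return check(domain, timeout)
        except Exception as exc:  # one bad domain must not stop the batch
            return (domain, "error", repr(exc))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(safe, domains))

def fake_check(domain, timeout):
    """Stand-in for the real per-domain check, for demonstration only."""
    if domain == "dead.example":
        raise TimeoutError("no answer")
    return (domain, "ok", "")

path, threads, timeout = parse_settings(EXAMPLE_CONFIG)
results = run_all(["a.example", "dead.example", "b.example"], fake_check, threads, timeout)
```

The pool size caps how many domains are in flight at once, so a 10K list is worked through in batches of at most `threads` requests. The timeout only takes effect if the real check passes it on to its network calls.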
29 freelancers are bidding an average of $301 for this project
First of all thank you for an excellent description! I can provide you a Scrapy (Python) based web scraping tool that will read a list of domain from a CSV file and allow you to change number of threads to use. It wil...
Hello Sir, I am expert who understands the value of time. I pride myself in my attention to detail. I am very hard working and aim to deliver in less time than quoted. I want to make you, my employer happy without cha...
Hello, Hi, Nice to meet you! I have read your requirements carefully and I am very interesting for your project. I am confident of this project as I'm a professional Python expert with over 5 years of experience. gith...
Hello. As a web scraping and Python and Django expert, I am glad to place the bid on your project. As you can see in my profile, I am fully experienced and lots of skills in web development. I want to discuss more via...
I can write python app for Windows with GUI using the mentioned python lib and multi threading . It will save results in csv as per description . I have 6+ experience in Python .
Hi, I have gone through your requirement to scrape lots of websites. I am EXPERT in building scraping tools /scripts. Hence, I can SURELY work on your project. I am having 4 YEARS of EXPERIENCE in developing PHP-PYTHON...
Hi there, I like that project. Im a Python developer with lots of knowledge in Web Scraping and Python Libraries. I do not exactly understand how you are going to calculate the 'activity score' from sitemaps but Im aw...
Hi I offer a wide range of services, including , Python , Web Scraping I can create and deliver the project as per the information.I have skilled, expert programmers I'm very excited to assist you in making your...
Hello, dear Customer. I am interested in your project and feel confident after reading your project description. Please contact me so that we can discuss it in more detail. Looking forward to hearing from you soon. Tha...
Hi there. Thank you for your posting. I have read your posting carefully and I would like to work with you. I hope we can have a detailed discussion by chat and share our idea. Regards.
Hi. I am ready to write your project Write apps on your demand in many languages (Visual Basic, VBA, VBS, .NET, C#, JS, Python, Java, PowerShell) Write database apps including many db formats: MS Access, MS SQL, SQL...
hi i saw you your post regarding web scraping.i have done many similar tasks like [login to view URL] config .i will give you excel file and you can set parameters on that file(like file name,site name some filters maybe) i can m...
Hi. I am a freelancer working in 24/7 service. I have much experience for 6+ years in Scraping Field, I will produce the output in any format you desire. I think this project is for me and I can finish it for a short t...
It's a piece of cake for me. Hi, sir I am a Web scraping expert. I have many experiences like your project. I am very confident to complete this task in a few days. I am ready for your project and I will provide the b...
Hello. I am an expert of python scrape.I am very interested in your job and I can do it quickly. Please contact me. Thank you.
Hi, I'm Chunzuo. I read your description and understand what u need. I have already completed python task. Please contact with me to discuss more.
Greetings, This project is very similar to one that I recently completed, using Python to test a list of domain names and recording the results of their DNS lookups and HTTP server response status including redirects f...
Hi! there. ⭐Thanks⭐ good post. I am web scrapper. I am ready in any language such as PHP, Python. I have rich experience in scraping many sites. I want to meet you on the chatting. Regards.
I can start your project immediately. I can provide full-time communication and work your time-zone. If you give me a chance to serve you, I will provide a high quality product within the deadline. Best Regards