Talks, tutorials, articles and interviews
-
OWler: preliminary results for building a collaborative open web crawler, Dinzinger et al., OSSYM 2023 https://ca-roll.github.io/downloads/owler.pdf
-
Presentation: StormCrawler and OpenSearch (Feb 2023) https://www.youtube.com/watch?v=azHYI9pnjos
-
Crawling the German Health Web: Exploratory Study and Graph Analysis, Zowalla R, Wetter T, Pfeifer D. J Med Internet Res 2020;22(7):e17853 https://www.jmir.org/2020/7/e17853
-
Tutorial: StormCrawler 1.16 + Elasticsearch 7.5.0 https://youtu.be/8kpJLPdhvLw
-
StormCrawler open source web crawler strengthened by Elasticsearch, Kibana https://www.elastic.co/blog/stormcrawler-open-source-web-crawler-strengthened-by-elasticsearch-kibana
-
Harvesting Online Health Information, Richard Zowalla slides
-
Tutorial: StormCrawler 1.10 + Apache SOLR 7.4.0 https://youtu.be/F8nvGj03XLo
-
Patent-Crawler: A Web Crawler to Gather Virtual Patent Marking Information, Etienne Orliac, Université de Lausanne/École Polytechnique Fédérale de Lausanne (UNIL/EPFL) slides - video
-
DigitalPebble’s Blog: Crawl dynamic content with Selenium and StormCrawler https://digitalpebble.blogspot.co.uk/2017/04/crawl-dynamic-content-with-selenium-and.html
-
The Battle of the Crawlers: Apache Nutch vs. StormCrawler https://dzone.com/articles/the-battle-of-the-crawlers-apache-nutch-vs-stormcr
-
Tutorial: StormCrawler + Elasticsearch + Kibana https://digitalpebble.blogspot.co.uk/2017/04/video-tutorial-stormcrawler.html
-
Q&A with InfoQ https://www.infoq.com/news/2016/12/nioche-stormcrawler-web-crawler
-
DigitalPebble’s Blog: Index the web with StormCrawler (revisited) https://digitalpebble.blogspot.co.uk/2016/09/index-web-with-stormcrawler-revisited.html
-
DigitalPebble’s Blog: Index the web with AWS CloudSearch https://digitalpebble.blogspot.co.uk/2015/09/index-web-with-aws-cloudsearch.html
-
Low latency scalable web crawling on Apache Storm slides - video, by Julien Nioche. Berlin Buzzwords 2015
-
Storm Crawler: A real-time distributed web crawling and monitoring framework slides, by Jake Dodd - Ontopic, ApacheCon North America 2015
-
A quick introduction to Storm Crawler slides, by Julien Nioche. ApacheCon Europe, Budapest, Nov 2014
-
StormCrawler in the wild slides, by Jake Dodd - Ontopic, ApacheCon Europe, Budapest, Nov 2014
Drop us a line at dev@stormcrawler.apache.org if you want to be added to this page.