Welcome all who are reading this article. I was given a task of creating a parser (spider) with the Scrapy library and parsing FTP server with data. The parser had to find lists of files on the server and handle each file separately depending on the requirement to the parser.
When dealing with one of our projects (LookSMI media monitoring platform) we have to handle the huge volume of data – and its quantity is constantly growing. At the same time, we must run quick searches with smart rules. In this article I'll explain how we have achieved required performance.