Scrapy: Csv Output Without Header

When I use the command scrapy crawl -o , I get the output of my Item dictionary with headers. This is good. However, I would like scrapy to omit the header line.

Solution 1:

There is include_headers_line=True in CsvItemExporter but I don't know how to use it directly. http://doc.scrapy.org/en/latest/topics/exporters.html#csvitemexporter

But you can create your own exporter that sets include_headers_line=False, in a file exporters.py (in the same folder as settings.py and items.py)

from scrapy.exporters import CsvItemExporter


class HeadlessCsvItemExporter(CsvItemExporter):

    def __init__(self, *args, **kwargs):
        # always suppress the header row
        kwargs['include_headers_line'] = False
        super(HeadlessCsvItemExporter, self).__init__(*args, **kwargs)

Then you have to set this exporter in settings.py

FEED_EXPORTERS = {
    'csv': 'your_project_name.exporters.HeadlessCsvItemExporter',
}

And now scrapy should write csv file without headers.

scrapy crawl <project> -o <filename.csv>

Or you can set

FEED_EXPORTERS = {
    'headless': 'your_project_name.exporters.HeadlessCsvItemExporter',
}

and get csv without headers only when you use -t headless

scrapy crawl <project> -o <filename.csv> -t headless

PS: don't forget to use your project name in place of your_project_name in settings.py


EDIT:

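Now the exporter skips the headers only if the file is not empty (if file.tell() > 0), so the header is written once when the file is new and omitted when appending to existing output.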

from scrapy.exporters import CsvItemExporter


class HeadlessCsvItemExporter(CsvItemExporter):

    def __init__(self, *args, **kwargs):

        # args[0] is the (opened) file handle
        # if the file is not empty then skip headers
        if args[0].tell() > 0:
            kwargs['include_headers_line'] = False

        super(HeadlessCsvItemExporter, self).__init__(*args, **kwargs)
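
The tell() check can be tried outside Scrapy. Below is a minimal sketch using only the standard library csv module, not Scrapy's exporter. The function name append_rows and the sample rows are made up for illustration. It writes the header only when the stream is still empty, which is the same idea as the exporter above.

```python
import csv
import io


def append_rows(stream, fieldnames, rows):
    """Append dict rows as CSV, writing the header only if the stream is empty."""
    # Move to the end so tell() reports how much has already been written.
    stream.seek(0, io.SEEK_END)
    writer = csv.DictWriter(stream, fieldnames=fieldnames, lineterminator="\n")
    if stream.tell() == 0:
        # Nothing written yet: this is the first batch, so emit the header.
        writer.writeheader()
    writer.writerows(rows)


# An in-memory buffer stands in for the output file (hypothetical data).
buffer = io.StringIO()
append_rows(buffer, ["name", "price"], [{"name": "apple", "price": 1}])
append_rows(buffer, ["name", "price"], [{"name": "pear", "price": 2}])
print(buffer.getvalue())
```

The second call finds the buffer non-empty and skips the header, so the header appears once at the top. With a real file you would open it in append mode, so the position starts at the end of any existing content.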

Solution 2:

The following settings.py worked for me.

FEEDS = {
    '<filename.csv>': {
        'format': 'csv',
        'item_export_kwargs': {
           'include_headers_line': False,
        },
    }
}
