Scrapy: Csv Output Without Header
When I use the command scrapy crawl -o , I get the output of my Item dictionary with headers. This is good. However, I would like scrapy to omit
Solution 1:
CsvItemExporter has an include_headers_line option (True by default), but I don't know how to use it directly. See http://doc.scrapy.org/en/latest/topics/exporters.html#csvitemexporter
But you can create your own exporter with include_headers_line=False in the file exporters.py (in the same folder as settings.py and items.py):
from scrapy.exporters import CsvItemExporter

class HeadlessCsvItemExporter(CsvItemExporter):

    def __init__(self, *args, **kwargs):
        # always suppress the header row
        kwargs['include_headers_line'] = False
        super(HeadlessCsvItemExporter, self).__init__(*args, **kwargs)
Then you have to set this exporter in settings.py
FEED_EXPORTERS = {
'csv': 'your_project_name.exporters.HeadlessCsvItemExporter',
}
Now Scrapy should write the CSV file without headers:
scrapy crawl <project> -o <filename.csv>
Or you can set
FEED_EXPORTERS = {
'headless': 'your_project_name.exporters.HeadlessCsvItemExporter',
}
and get csv without headers only when you use -t headless
scrapy crawl <project> -o <filename.csv> -t headless
PS. Don't forget to use your project name in place of your_project_name in settings.py.
EDIT:
Now the exporter skips headers only if the file is not empty (if file.tell() > 0):
from scrapy.exporters import CsvItemExporter

class HeadlessCsvItemExporter(CsvItemExporter):

    def __init__(self, *args, **kwargs):
        # args[0] is the (opened) file handler
        # if the file is not empty then skip headers
        if args[0].tell() > 0:
            kwargs['include_headers_line'] = False
        super(HeadlessCsvItemExporter, self).__init__(*args, **kwargs)
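
The "write headers only when the file is empty" check can be tried outside Scrapy. The following is only a sketch of the same idea using the standard csv module. The function names append_rows and demo, the file name items.csv and the field names are made up for illustration and are not part of Scrapy.

```python
import csv
import os
import tempfile


def append_rows(path, fieldnames, rows):
    """Append dict rows to a CSV file; write the header only if the file is empty."""
    # Append mode positions the stream at the end of the file,
    # so tell() == 0 means nothing has been written yet.
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if f.tell() == 0:
            writer.writeheader()
        writer.writerows(rows)


def demo():
    """Append twice to a fresh temporary file and return its contents."""
    path = os.path.join(tempfile.mkdtemp(), "items.csv")  # hypothetical file name
    append_rows(path, ["name", "price"], [{"name": "a", "price": 1}])
    append_rows(path, ["name", "price"], [{"name": "b", "price": 2}])
    with open(path, newline="") as f:
        return f.read()
```

Calling demo() appends two batches to the same file; the header row is written by the first call only, which mirrors what the exporter above does when a crawl appends to an existing output file.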
Solution 2:
The following settings.py worked for me.
FEEDS = {
    '<filename.csv>': {
        'format': 'csv',
        'item_export_kwargs': {
            'include_headers_line': False,
        },
    }
}
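
If several output files need the same treatment, the dictionary can be built with a small helper in settings.py. This is only a sketch: headless_csv_feeds is a hypothetical helper name and the two file names are placeholders. It assumes a Scrapy version whose FEEDS setting accepts item_export_kwargs, as in the settings above.

```python
def headless_csv_feeds(*paths):
    """Build a FEEDS-style dict that exports each path as CSV without a header row."""
    return {
        path: {
            'format': 'csv',
            # passed through to the item exporter's constructor
            'item_export_kwargs': {'include_headers_line': False},
        }
        for path in paths
    }


# placeholder file names; replace with your own output paths
FEEDS = headless_csv_feeds('items.csv', 'backup/items.csv')
```

Each key is an output path and each value repeats the same per-feed options, so the result has the same shape as the hand-written dictionary.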