Python Csv Writer
I have a CSV that looks like this:
HA-MASTER,CategoryID
38231-S04-A00,14
39790-S10-A03,14
38231-S04-A00,15
39790-S10-A03,15
38231-S04-A00,16
39790-S10-A03,16
38231-S04-A00,17
39790
Solution 1:
Yes, there is a way:
import csv

def addRowToDict(row):
    global myDict
    key = row[1]
    if key in myDict:
        # append values if entry already exists
        myDict[key].append(row[0])
    else:
        # create entry
        myDict[key] = [row[1], row[0]]

myDict = dict()
inFile = 'C:/Users/xxx/Desktop/pythons/test.csv'
outFile = 'C:/Users/xxx/Desktop/pythons/testOut.csv'

with open(inFile, 'r') as f:
    reader = csv.reader(f)
    ignore = True
    for row in reader:
        if ignore:
            # ignore first row
            ignore = False
        else:
            # add entry to dict
            addRowToDict(row)

with open(outFile, 'w') as f:
    writer = csv.writer(f)
    # write everything to file (values() works in Python 2 and 3)
    writer.writerows(myDict.values())
Just edit inFile and outFile to point at your own files.
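On Python 3, the same approach can be sketched without the global dictionary. This is a minimal sketch, assuming the two-column layout from the question; the function name regroup_csv and the temporary sample file are hypothetical and not part of the original answer.

```python
import csv
import os
import tempfile


def regroup_csv(in_path, out_path):
    """Group column 0 values by column 1 and write one row per group."""
    groups = {}
    # newline='' is what the csv module docs recommend for file objects
    with open(in_path, newline='') as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or incomplete rows
            # each output row starts with the category ID itself
            groups.setdefault(row[1], [row[1]]).append(row[0])
    with open(out_path, 'w', newline='') as f:
        csv.writer(f).writerows(groups.values())


# Hypothetical sample data written to a temporary directory
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, 'test.csv')
out_path = os.path.join(tmp_dir, 'testOut.csv')
with open(in_path, 'w', newline='') as f:
    f.write('HA-MASTER,CategoryID\n'
            '38231-S04-A00,14\n'
            '39790-S10-A03,14\n'
            '38231-S04-A00,15\n')

regroup_csv(in_path, out_path)
with open(out_path, newline='') as f:
    rows = list(csv.reader(f))
print(rows)
```

Passing the paths as arguments keeps the grouping logic reusable; the rows come out in the order in which each category ID first appears in the input.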
Solution 2:
This is pretty trivial using a dictionary of lists (Python 2.7 solution):
#!/usr/bin/env python
import fileinput

categories = {}
for line in fileinput.input():
    # Skip the first line in the file (assuming it is a header).
    if fileinput.isfirstline():
        continue
    # Split the input line into two fields.
    ha_master, cat_id = line.strip().split(',')
    # If the given category id is NOT already in the dictionary
    # add a new empty list
    if cat_id not in categories:
        categories[cat_id] = []
    # Append a new value to the category.
    categories[cat_id].append(ha_master)

# Iterate over all category IDs and lists. Use ','.join() to
# output a comma separated list from a Python list.
for k, v in categories.items():
    print('%s,%s' % (k, ','.join(v)))
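The "create an empty list if the key is missing" step can also be left to collections.defaultdict. The following is only a sketch of that variation; the helper name group_lines and the sample lines are hypothetical, and it assumes every data line has exactly two comma-separated fields.

```python
from collections import defaultdict


def group_lines(lines):
    """Return {category_id: [ha_master, ...]} from 'ha_master,cat_id' lines."""
    categories = defaultdict(list)
    for number, line in enumerate(lines):
        line = line.strip()
        if number == 0 or not line:
            continue  # skip the header and blank lines
        ha_master, cat_id = line.split(',')
        categories[cat_id].append(ha_master)  # missing keys start as []
    return categories


# Hypothetical sample lines
sample = [
    'HA-MASTER,CategoryID',
    '38231-S04-A00,14',
    '39790-S10-A03,14',
    '38231-S04-A00,15',
]
grouped = group_lines(sample)
for cat_id, masters in grouped.items():
    print('%s,%s' % (cat_id, ','.join(masters)))
```

Any iterable of lines works here, so an open file object or fileinput.input() could be passed in place of the list.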
Solution 3:
I would read in the entire file and create a dictionary where the key is the category ID and the value is a list of the other data.
data = {}
with open("test.csv", "r") as f:
    for line in f:
        temp = line.rstrip().split(',')
        if len(temp[0].split('-')) == 3:  # => specific format that ignores the header...
            if temp[1] in data:
                data[temp[1]].append(temp[0])
            else:
                data[temp[1]] = [temp[0]]
with open("output.csv", "w+") as f:
    for cat_id, datum in data.items():
        f.write("{},{}\n".format(cat_id, ','.join(datum)))
Solution 4:
Use pandas!
import pandas
csv_data = pandas.read_csv('path/to/csv/file')
use_this = csv_data.groupby('CategoryID')['HA-MASTER'].apply(list)
You will get a Series that maps each CategoryID to the list of its HA-MASTER values; now you just have to format it.
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
Cheers.
Solution 5:
I see many beautiful answers have come up while I was trying it, but I'll post mine as well.
import re
csvIN = open('your csv file','r')
csvOUT = open('out.csv','w')
cat = dict()
for line in csvIN:
    line = line.rstrip()
    # skip the header and any line that does not start with a digit
    if not re.search('^[0-9]+',line): continue
    ham, cid = line.split(',')
    if cat.get(cid,False):
        cat[cid] = cat[cid] + ',' + ham
    else:
        cat[cid] = ham
for i in sorted(cat):
    csvOUT.write(i + ',' + cat[i] + '\n')
# close both files so the output is flushed to disk
csvIN.close()
csvOUT.close()
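One detail about sorted(cat): the keys are strings, so they are ordered as text rather than as numbers. That makes no difference for the two-digit IDs in the question, but it would if the IDs had different lengths. A small sketch, using hypothetical IDs, of how key=int changes the order:

```python
# Hypothetical category IDs, stored as strings as in the loop above
cat = {'100': 'a', '14': 'b', '9': 'c'}

text_order = sorted(cat)              # compares character by character
numeric_order = sorted(cat, key=int)  # compares the integer values

print(text_order)     # ['100', '14', '9']
print(numeric_order)  # ['9', '14', '100']
```

Note that key=int assumes every ID is a plain integer; a non-numeric key would raise ValueError.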