How do I index a Google Commerce Search feed into SearchBlox?

You can use the following Python script to index your Google Commerce Search feed. The script is written for Python 2 (it uses urllib2 and the print statement).

First, create a Custom Collection within SearchBlox called googlefeed, and update the API Key within the script.

The script reads the feed from /products/example_feed.txt. A sample feed can be obtained from https://support.google.com/merchants/answer/160567

Remove any fields that you don't want to index by commenting out the corresponding lines. Also, make sure that no columns are blank within the feed file.
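One way to check the feed for blank columns before indexing is sketched below. It is written for Python 3 and is only an illustration: the helper name find_blank_cells, the comma delimiter, and the in-memory sample data are assumptions, not part of SearchBlox or the Google feed format. Pass a different delimiter if your feed is tab-delimited.

```python
import csv
import io


def find_blank_cells(feed_file, expected_columns, delimiter=","):
    """Return (row_number, column_index) pairs for empty or missing cells.

    find_blank_cells is a hypothetical helper for illustration only.
    Row numbers start at 1; column indexes start at 0, matching row[0], row[1], ...
    """
    problems = []
    reader = csv.reader(feed_file, delimiter=delimiter)
    for row_number, row in enumerate(reader, start=1):
        for column_index in range(expected_columns):
            # A cell is a problem if the row is too short or the value is only whitespace
            if column_index >= len(row) or not row[column_index].strip():
                problems.append((row_number, column_index))
    return problems


# Usage with a small in-memory sample instead of a real feed file
sample = io.StringIO(
    "Red shirt,http://example.com/1,Cotton shirt\n"
    "Blue shirt,,Denim shirt\n"
)
print(find_blank_cells(sample, expected_columns=3))
```

To check a real feed, open the file in text mode with newline='' and pass the file object in place of the sample.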

@@@

import csv
import urllib2

# SearchBlox REST endpoint for adding documents to a custom collection
URL = "http://localhost:8080/searchblox/api/rest/add"

# row[0], row[1], ... follow the column order of the feed file.
# If your feed is tab-delimited, use csv.reader(f, delimiter='\t') instead.
with open('/products/example_feed.txt', 'rb') as f:
    reader = csv.reader(f)
    counter = 0
    for row in reader:
        # The running row count is used as the document uid
        counter = counter + 1
        # Build the XML request for this row; replace the apikey value with your own
        xml = """<?xml version="1.0" encoding="utf-8"?>
<searchblox apikey="57836EABBDF80CCA0632AF4D8D0A6DE3"><document colname="googlefeed" location=" """+row[1]+""" ">
<uid>"""+str(counter)+"""</uid>
<title>"""+row[0]+"""</title>
<description>"""+row[2]+"""</description>
<content>"""+row[0] + " - " +row[2] +"""</content>
<meta name="productid">"""+row[3]+"""</meta>
<meta name="condition">"""+row[4]+"""</meta>
<meta name="price">"""+str(row[5].replace('USD',''))+"""</meta>
<meta name="availability">"""+row[6]+"""</meta>
<meta name="image">"""+row[7]+"""</meta>
<meta name="shipping">"""+row[8]+"""</meta>
<meta name="weight">"""+row[9]+"""</meta>
<meta name="gtin">"""+row[10]+"""</meta>
<meta name="brand">"""+str(row[11])+"""</meta>
<meta name="mpn">"""+row[12]+"""</meta>
<meta name="product_cat">"""+row[13]+"""</meta>
<meta name="product_type">"""+row[14]+"""</meta>
<meta name="add_image_link">"""+row[15]+"""</meta>
<meta name="color">"""+row[16]+"""</meta>
<meta name="size">"""+row[17]+"""</meta>
<meta name="gender">"""+row[18]+"""</meta>
<meta name="age_group">"""+row[19]+"""</meta>
<meta name="item_group_id">"""+row[20]+"""</meta>
<meta name="sales_price">"""+row[21]+"""</meta>
<meta name="sale_date">"""+row[22]+"""</meta>
</document></searchblox>"""

        # Post the document to SearchBlox
        req = urllib2.Request(URL)
        req.add_header('Content-Type', 'application/xml')
        res = urllib2.urlopen(req, xml)

        # Print the response returned by SearchBlox
        response = res.readlines()
        for line in response:
            print line

@@@
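The script above inserts feed values into the XML as-is, so a value containing &, < or a double quote would produce a malformed request. The sketch below, written for Python 3, shows one possible way to build the same request with escaped values using the standard library's xml.sax.saxutils module. The helper name build_document_xml and the sample values are assumptions for illustration; the element and attribute names are taken from the script above.

```python
from xml.sax.saxutils import escape, quoteattr


def build_document_xml(api_key, colname, uid, location, title, description, meta):
    """Return an add-document request with every feed value XML-escaped.

    build_document_xml is a hypothetical helper, not part of SearchBlox.
    meta is a list of (name, value) pairs, one per <meta> element.
    """
    # escape() handles element text; quoteattr() escapes and quotes attribute values
    meta_xml = "".join(
        "<meta name=%s>%s</meta>\n" % (quoteattr(name), escape(value))
        for name, value in meta
    )
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        "<searchblox apikey=%s><document colname=%s location=%s>\n"
        "<uid>%s</uid>\n"
        "<title>%s</title>\n"
        "<description>%s</description>\n"
        "<content>%s</content>\n"
        "%s"
        "</document></searchblox>"
    ) % (
        quoteattr(api_key),
        quoteattr(colname),
        quoteattr(location),
        escape(str(uid)),
        escape(title),
        escape(description),
        escape(title + " - " + description),
        meta_xml,
    )


# Usage with made-up values that contain characters needing escaping
print(build_document_xml(
    "YOUR_API_KEY", "googlefeed", 1,
    "http://example.com/item?id=1&ref=feed",
    "Shirts & Ties", "Sizes < XL",
    [("brand", "A&B"), ("price", "15.00")],
))
```

Building the meta elements from a list of pairs also means a field can be dropped by removing its pair rather than editing the XML string.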
