Can I use Python to index URLs and documents?

Yes. The sample Python script below indexes a URL or a document. A filesystem path can be used as the location only for files on the same server as SearchBlox.

Refer to the Developers page for parameters that can be passed through the XML message: http://www.searchblox.com/developers/api.

@@@
# Python 3: urllib2 was merged into urllib.request
import urllib.request

# REST endpoint of the local SearchBlox server that adds a document to the index
url = "http://localhost:8080/searchblox/api/rest/add"
xml = """<?xml version="1.0" encoding="utf-8"?>
<searchblox apikey="E7E4792C64DC98B141C336575FEBE571">
<document colname="custom" location="http://www.searchblox.com/">
<title boost="1">SearchBlox Product Features</title>
<keywords boost="1">SearchBlox, Solr, Lucene, Faceted Search</keywords>
<content boost="1"> This content overrides the content from the document.</content>
<description boost="1">SearchBlox Content Search Features</description>
<lastmodified>07 May 2005 06:19:42 GMT</lastmodified>
<size>44244</size>
<alpha>Features</alpha>
<contenttype>HTML</contenttype>
<category>SearchBlox/Features</category>
<category>SearchBlox/Product</category>
<meta name="company">Searchblox Software Inc</meta>
<meta name="location">Richmond</meta>
</document>
</searchblox>"""

# Supplying data makes urllib send a POST request; the body must be bytes
req = urllib.request.Request(url, data=xml.encode("utf-8"))
req.add_header('Content-Type', 'application/xml')
res = urllib.request.urlopen(req)

# Print the server's response line by line
for line in res:
    print(line.decode("utf-8").rstrip())

@@@
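
If you index many documents, building the XML message by hand becomes error-prone, because characters such as & or < in a title or URL must be escaped. The following is a minimal sketch, assuming Python 3 and only its standard library, that builds the same kind of message with ElementTree. The helper names build_payload and index_document, the API_KEY placeholder, and the choice of fields are illustrative assumptions, not part of the SearchBlox API; only elements that appear in the sample above are used.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed values: replace with your own server URL and API key.
API_URL = "http://localhost:8080/searchblox/api/rest/add"
API_KEY = "YOUR-API-KEY"


def build_payload(apikey, colname, location, title, content, meta=None):
    """Build the XML message for one document (hypothetical helper)."""
    root = ET.Element("searchblox", {"apikey": apikey})
    doc = ET.SubElement(root, "document",
                        {"colname": colname, "location": location})
    ET.SubElement(doc, "title", {"boost": "1"}).text = title
    ET.SubElement(doc, "content", {"boost": "1"}).text = content
    # Optional <meta name="..."> elements, as in the sample message
    for name, value in (meta or {}).items():
        ET.SubElement(doc, "meta", {"name": name}).text = value
    # ElementTree escapes &, < and > in text and attribute values.
    body = ET.tostring(root, encoding="utf-8")
    return b'<?xml version="1.0" encoding="utf-8"?>\n' + body


def index_document(payload, url=API_URL):
    """POST the XML message and return the response text (hypothetical helper)."""
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/xml"})
    with urllib.request.urlopen(req) as res:
        return res.read().decode("utf-8")


# Example call (requires a running SearchBlox server):
# payload = build_payload(API_KEY, "custom", "http://www.searchblox.com/",
#                         "SearchBlox Product Features", "Override content")
# print(index_document(payload))
```

To index several URLs, call build_payload and index_document once per URL in a loop. Refer to the Developers page for the other elements the message accepts.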
