<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>meier-online &#187; fetch</title>
	<atom:link href="http://meier-online.com/tag/fetch/feed/" rel="self" type="application/rss+xml" />
	<link>http://meier-online.com</link>
	<description>Karsten Meier's blog</description>
	<lastBuildDate>Sun, 09 Oct 2011 13:27:57 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>How to fetch remote data with the App Engine</title>
		<link>http://meier-online.com/en/2008/08/fremde-daten-mit-der-appengine-verarbeiten/</link>
		<comments>http://meier-online.com/en/2008/08/fremde-daten-mit-der-appengine-verarbeiten/#comments</comments>
		<pubDate>Wed, 06 Aug 2008 15:36:00 +0000</pubDate>
		<dc:creator>meier</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[AppEngine]]></category>
		<category><![CDATA[fetch]]></category>
		<category><![CDATA[howto]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[SSL]]></category>

		<guid isPermaLink="false">http://meier-online.com/blog/?p=164</guid>
		<description><![CDATA[You can use App Engine to retrieve web pages or other data from remote servers and process them in your Python program. This capability is necessarily limited, because the potential for abuse is very high. You therefore cannot call arbitrary Internet services. You can retrieve only &#8220;web pages&#8221; or everything that a Web [...]]]></description>
			<content:encoded><![CDATA[<p>You can use App Engine to retrieve web pages or other data from remote servers and process them in your Python program.</p>
<p>This capability is necessarily limited, because the potential for abuse is very high. You therefore cannot call arbitrary Internet services: you can retrieve only &#8220;web pages&#8221;, that is, anything a web server offers on port 80 or 443.</p>
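
<p>The port restriction described above can be sketched as a small pre-check. This is only an illustration: <code>is_fetchable</code> and <code>ALLOWED_PORTS</code> are hypothetical names, not part of App Engine, and the pairing of http with port 80 and https with port 443 is an assumption based on the text.</p>

```python
from urllib.parse import urlsplit

# Assumption: http is served on port 80 and https on port 443, per the text.
ALLOWED_PORTS = {"http": 80, "https": 443}


def is_fetchable(url):
    """Return True if url uses http or https on its default port."""
    parts = urlsplit(url)
    default_port = ALLOWED_PORTS.get(parts.scheme)
    if default_port is None or not parts.hostname:
        return False
    try:
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    # parts.port is None when the URL names no explicit port
    return port is None or port == default_port
```

<p>A check like this lets a program reject an unsupported URL before it attempts the fetch at all.</p>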
<p><span id="more-164"></span>The urlfetch module implements this. A simple use looks like this:</p>

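<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">from</span> google.<span style="color: black;">appengine</span>.<span style="color: black;">api</span> <span style="color: #ff7700;font-weight:bold;">import</span> urlfetch
response = urlfetch.<span style="color: black;">fetch</span><span style="color: black;">&#40;</span>url<span style="color: black;">&#41;</span>
content = response.<span style="color: black;">content</span></pre></div></div>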

<p>If everything goes well, this is enough, and it is very convenient. In most real-world scenarios, however, you need to handle all kinds of error conditions, as in this skeleton:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">    <span style="color: #ff7700;font-weight:bold;">try</span>:
        response = urlfetch.<span style="color: black;">fetch</span><span style="color: black;">&#40;</span>jadurl<span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">if</span> response.<span style="color: black;">status_code</span> == <span style="color: #ff4500;">200</span>:
            <span style="color: #008000;">self</span>.<span style="color: black;">xxheaders</span> = response.<span style="color: black;">headers</span>
            <span style="color: #008000;">self</span>.<span style="color: black;">xxcontent</span> = response.<span style="color: black;">content</span>
            <span style="color: #008000;">self</span>.<span style="color: black;">xxFetched</span> = <span style="color: #008000;">True</span>
        <span style="color: #ff7700;font-weight:bold;">elif</span> response.<span style="color: black;">status_code</span> == <span style="color: #ff4500;">404</span>:
            <span style="color: #008000;">self</span>.<span style="color: black;">errortext</span> = <span style="color: #483d8b;">&quot;File not found&quot;</span>
            <span style="color: #008000;">self</span>.<span style="color: black;">xxcontent</span> = <span style="color: #483d8b;">&quot;not found&quot;</span>
        <span style="color: #ff7700;font-weight:bold;">else</span>:
            <span style="color: #008000;">self</span>.<span style="color: black;">errortext</span> = <span style="color: #483d8b;">&quot;Bad Response Code&quot;</span>
            <span style="color: #008000;">self</span>.<span style="color: black;">xxcontent</span> = response.<span style="color: black;">status_code</span>
    <span style="color: #ff7700;font-weight:bold;">except</span> urlfetch.<span style="color: black;">InvalidURLError</span>:
        <span style="color: #008000;">self</span>.<span style="color: black;">errortext</span> = <span style="color: #483d8b;">&quot;Invalid URL&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">except</span> urlfetch.<span style="color: black;">DownloadError</span>:
        <span style="color: #008000;">self</span>.<span style="color: black;">errortext</span> = <span style="color: #483d8b;">&quot;Error downloading file&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">except</span> urlfetch.<span style="color: black;">ResponseTooLargeError</span>:
        <span style="color: #008000;">self</span>.<span style="color: black;">errortext</span> = <span style="color: #483d8b;">&quot;File too large&quot;</span></pre></div></div>
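
<p>The status-code branch of the skeleton can also be factored into a small pure function, which makes it testable without App Engine. This is a minimal sketch; <code>describe_status</code> is a hypothetical helper, not part of urlfetch.</p>

```python
def describe_status(status_code):
    """Map an HTTP status code to an (ok, error_text) pair."""
    if status_code == 200:
        return True, ""
    if status_code == 404:
        return False, "File not found"
    # Any other code is reported together with its number
    return False, "Bad response code: %d" % status_code
```

<p>The caller would pass in <code>response.status_code</code> and store the returned error text only when the first element is false.</p>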

<h3>HTTP Headers</h3>
<p style="margin-bottom: 0cm;">You can also access the HTTP headers. For example, to read the MIME type:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">response.<span style="color: black;">headers</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'content-type'</span><span style="color: black;">&#93;</span></pre></div></div>
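
<p>The content-type value often carries parameters such as a charset, so a little parsing may be needed to get the bare MIME type. The following is a sketch under the assumption that the headers behave like a plain dictionary; <code>mime_type</code> is a hypothetical helper.</p>

```python
def mime_type(headers):
    """Return the bare MIME type from a headers mapping, or None."""
    # HTTP header names are case-insensitive, and a plain dict may not
    # normalize them, so compare the lower-cased names.
    for name, value in headers.items():
        if name.lower() == "content-type":
            # Drop parameters such as "; charset=utf-8"
            return value.split(";", 1)[0].strip().lower()
    return None
```

<p>Called as <code>mime_type(response.headers)</code>, this returns <code>None</code> when the server sent no content-type header.</p>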

<p>In my particular use case I would also like to know whether a header was sent more than once. It seems that this information is not available.</p>
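
<p>To make the limitation concrete: detecting a repeated header needs the raw sequence of (name, value) pairs, which a dictionary-like headers object cannot preserve, because each key appears only once. The sketch below assumes such a list of pairs is available from some other source; <code>repeated_headers</code> is a hypothetical helper.</p>

```python
from collections import Counter


def repeated_headers(header_pairs):
    """Return the sorted, lower-cased names that occur more than once."""
    counts = Counter(name.lower() for name, _value in header_pairs)
    return sorted(name for name, n in counts.items() if n > 1)
```

<p>Names are compared in lower case because HTTP header names are case-insensitive.</p>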
<h3>SSL</h3>
<p>You can also access SSL-protected pages with &#8220;https://&#8221; URLs. Unfortunately, according to the documentation as of version 1.1, the server certificate is not checked. This means you cannot be sure you are talking to the intended server, so the additional security that SSL normally provides is missing. This restricts the range of possible applications.</p>
]]></content:encoded>
			<wfw:commentRss>http://meier-online.com/en/2008/08/fremde-daten-mit-der-appengine-verarbeiten/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

