<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jim's Random Notes &#187; Internet</title>
	<atom:link href="http://blog.mischel.com/category/internet/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.mischel.com</link>
	<description></description>
	<lastBuildDate>Wed, 21 Jul 2010 21:16:34 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>But isn&#8217;t that what the Web is for?</title>
		<link>http://blog.mischel.com/2010/07/19/but-isnt-that-what-the-web-is-for/</link>
		<comments>http://blog.mischel.com/2010/07/19/but-isnt-that-what-the-web-is-for/#comments</comments>
		<pubDate>Mon, 19 Jul 2010 21:28:46 +0000</pubDate>
		<dc:creator>Jim</dc:creator>
				<category><![CDATA[Idiocy]]></category>
		<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://blog.mischel.com/?p=883</guid>
		<description><![CDATA[The Terms of Use for the site yobi.tv includes the following (the emphasis is mine):

8. RESTRICTIONS ON USE
You may use this Site only for purposes expressly permitted by this Site. You may not use this Site for any other purpose, including any commercial purpose, without YOBI&#8217;s express prior written consent. For example, you may not [...]]]></description>
			<content:encoded><![CDATA[<p>The Terms of Use for the site yobi.tv includes the following (the emphasis is mine):</p>
<blockquote>
<div id="_mcePaste"><strong>8. RESTRICTIONS ON USE</strong></div>
<div id="_mcePaste">You may use this Site only for purposes expressly permitted by this Site. You may not use this Site for any other purpose, including any commercial purpose, without YOBI&#8217;s express prior written consent. For example, you may not (and may not authorize any other party to) (i) co-brand this Site, or (ii) frame this Site, or <em><span style="text-decoration: underline;">(iii) hyper-link to this Site</span></em>, without the express prior written permission of an authorized representative of YOBI. For purposes of these Terms of Use, “co-branding” means to display a name, logo, trademark, or other means of attribution or identification of any party in such a manner as is reasonably likely to give a user the impression that such other party has the right to display, publish, or distribute this Site or content accessible within this Site. You agree to cooperate with YOBI in causing any unauthorized co-branding, framing or hyper-linking immediately to cease.</div>
</blockquote>
<p>Far be it from me to violate their Terms, which is why the name of their site, above, is not hyperlinked.</p>
<p>I thought this particular idiocy had been eliminated years ago.  If you don&#8217;t want people to link to you, why the heck are you on the Web at all?  I think somebody needs to rein in the lawyers again.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mischel.com/2010/07/19/but-isnt-that-what-the-web-is-for/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Captcha this!</title>
		<link>http://blog.mischel.com/2010/06/21/captcha-this/</link>
		<comments>http://blog.mischel.com/2010/06/21/captcha-this/#comments</comments>
		<pubDate>Mon, 21 Jun 2010 18:27:38 +0000</pubDate>
		<dc:creator>Jim</dc:creator>
				<category><![CDATA[Idiocy]]></category>
		<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://blog.mischel.com/?p=875</guid>
		<description><![CDATA[I have trouble with a lot of those &#8220;captcha&#8221; systems.  When the letters are oddly curved, run together, or otherwise obscured, I have a tough time.  I thought this one wouldn&#8217;t be so tough, though.  Until I saw the note, &#8220;it is case sensitive.&#8221;

The &#8220;y&#8221;, &#8220;e&#8221;, &#8220;4&#8243;, and &#8220;u&#8221; are simple.  But is that &#8220;w&#8221; or [...]]]></description>
			<content:encoded><![CDATA[<p>I have trouble with a lot of those &#8220;captcha&#8221; systems.  When the letters are oddly curved, run together, or otherwise obscured, I have a tough time.  I thought this one wouldn&#8217;t be so tough, though.  Until I saw the note, &#8220;it is case sensitive.&#8221;</p>
<p><a href="http://blog.mischel.com/wp-content/uploads/2010/06/passcode.jpg"><img class="aligncenter size-full wp-image-876" title="passcode" src="http://blog.mischel.com/wp-content/uploads/2010/06/passcode.jpg" alt="" width="496" height="146" /></a></p>
<p>The &#8220;y&#8221;, &#8220;e&#8221;, &#8220;4&#8243;, and &#8220;u&#8221; are simple.  But is that &#8220;w&#8221; or &#8220;W&#8221;?  &#8220;z&#8221; or &#8220;Z&#8221;?</p>
<p>I wasn&#8217;t going to send a comment.  And, seeing as how I can&#8217;t reliably determine what to type here, I wouldn&#8217;t even attempt it.  Knowing my luck, I&#8217;d get into an infinite loop of failures and ambiguous prompts.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mischel.com/2010/06/21/captcha-this/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Is there a syndication format standard for media?</title>
		<link>http://blog.mischel.com/2010/06/03/is-there-a-syndication-format-standard-for-media/</link>
		<comments>http://blog.mischel.com/2010/06/03/is-there-a-syndication-format-standard-for-media/#comments</comments>
		<pubDate>Thu, 03 Jun 2010 19:04:44 +0000</pubDate>
		<dc:creator>Jim</dc:creator>
				<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://blog.mischel.com/?p=866</guid>
		<description><![CDATA[At work, we&#8217;re trying to make our media firehose available to other developers.  One thought is to publish a syndication feed through Pubsubhubbub.  I can easily do that, but I don&#8217;t know which format to use.
Pubsubhubbub was originally designed to support the Atom Syndication Format.  But there&#8217;s no widely accepted standard for publishing media information [...]]]></description>
			<content:encoded><![CDATA[<p>At work, we&#8217;re trying to make our media firehose available to other developers.  One thought is to publish a syndication feed through <a href="http://en.wikipedia.org/wiki/PubSubHubbub">Pubsubhubbub</a>.  I can easily do that, but I don&#8217;t know which format to use.</p>
<p>Pubsubhubbub was originally designed to support the <a href="http://tools.ietf.org/html/rfc4287">Atom Syndication Format</a>.  But there&#8217;s no widely accepted standard for publishing media information with Atom.  It looks like most sites that publish media use <a href="http://www.rssboard.org/rss-specification">RSS 2.0</a> and the <a href="http://www.rssboard.org/media-rss">Media RSS</a> extensions.</p>
<p>I touched on the <a href="http://mischel.com/diary/2004/08/18.htm">development of syndication formats</a> in a 2004 blog entry.  Atom &#8220;won&#8221; by becoming a standard, but RSS (mostly RSS 2.0) is so prevalent that it&#8217;s unlikely to be pushed out any time soon.</p>
<p>The developers of Pubsubhubbub <a href="http://code.google.com/p/pubsubhubbub/wiki/RssFeeds">included support for RSS</a> in July of last year, and the <a href="http://pubsubhubbub.googlecode.com/svn/trunk/pubsubhubbub-core-0.3.html">latest specification (0.3)</a> includes support.  This is both good and bad:  good because now all those sites that use RSS 2.0 can participate in Pubsubhubbub, and bad because those of us building readers still have to deal with two competing formats.</p>
<p>Also on the bad side, and especially pertinent to me, is the lack of any kind of media extensions for Atom.  There&#8217;s been some chatter in the past about using the Media RSS extensions inside of Atom, but it appears that nothing ever came of that.  Atom has an &#8220;enclosure&#8221; link relation that adds <em>some</em> media capability, but it&#8217;s pretty minimal.  The <a href="http://datatracker.ietf.org/doc/rfc5023/">Atom Publishing Protocol</a> specification allows for media, but not to any great extent.   There&#8217;s also a proposed standard for <a href="http://martin.atkins.me.uk/specs/atommedia">Atom Media Extensions</a>, although I find no mention of it on the IETF&#8217;s site.</p>
<p>So I&#8217;m left with this choice:  publish RSS 2.0 with Media RSS extensions (both non-standard), or publish a severely crippled Atom standard document.  Since it&#8217;s likely that RSS 2.0 will continue to thrive and there appear to be no standard media extensions to Atom, it looks like I&#8217;m stuck with RSS.  Unfortunate, but that&#8217;s the way it goes.</p>
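<p>For illustration only, here is a minimal sketch of what the RSS 2.0 plus Media RSS option might look like when generated from Python.  The feed title, the example.com URLs, and the dictionary keys are all hypothetical; the namespace URI and the <code>media:content</code> attributes (url, type, medium, duration) come from the Media RSS specification.</p>

```python
import xml.etree.ElementTree as ET

# Media RSS namespace as published in the Media RSS specification.
MEDIA_NS = "http://search.yahoo.com/mrss/"
ET.register_namespace("media", MEDIA_NS)


def build_feed(channel_title, channel_link, items):
    """Build an RSS 2.0 document whose items each carry one media:content element."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Hypothetical media firehose"
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["page_url"]
        # The media:content element describes the media file itself.
        ET.SubElement(item, "{%s}content" % MEDIA_NS, {
            "url": entry["media_url"],
            "type": entry["mime_type"],
            "medium": "video",
            "duration": str(entry["seconds"]),
        })
    return ET.tostring(rss, encoding="unicode")


# Hypothetical sample entry; every value here is made up for the example.
feed_xml = build_feed(
    "Example firehose",
    "http://example.com/",
    [{
        "title": "Sample clip",
        "page_url": "http://example.com/clips/1",
        "media_url": "http://example.com/clips/1.mp4",
        "mime_type": "video/mp4",
        "seconds": 95,
    }],
)
print(feed_xml)
```

<p>A real feed would carry more per-item metadata (thumbnails, descriptions, publication dates), but the shape stays the same: ordinary RSS 2.0 items with namespaced media elements attached.</p>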
<p>We love standards.  That&#8217;s why we have so many of them.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mischel.com/2010/06/03/is-there-a-syndication-format-standard-for-media/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Podly.TV is alive!</title>
		<link>http://blog.mischel.com/2010/05/27/podly-tv-is-alive/</link>
		<comments>http://blog.mischel.com/2010/05/27/podly-tv-is-alive/#comments</comments>
		<pubDate>Thu, 27 May 2010 15:05:08 +0000</pubDate>
		<dc:creator>Jim</dc:creator>
				<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://blog.mischel.com/?p=858</guid>
		<description><![CDATA[It&#8217;s been a long road.  Back in January of 2007, David Stafford and I came up with the idea of writing a media search engine.  We thought it&#8217;d take maybe a year.
We actually had something in a little more than a year, but it wasn&#8217;t very interesting.  It worked, but nobody would have been very [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been a long road.  Back in January of 2007, <a href="http://davidst.com/">David Stafford</a> and I came up with the idea of writing a media search engine.  We thought it&#8217;d take maybe a year.</p>
<p>We actually had something in a little more than a year, but it wasn&#8217;t very interesting.  It <em>worked</em>, but nobody would have been very interested in it if we had released it to the world.</p>
<p>By then we had grown to four people.  We stuck with it, improved our crawler and indexing technology, and released two or three other incarnations to very limited audiences.  Those attempts, too, were less than successful, but they gave us a lot of valuable information about what people like (and don&#8217;t like), and how users want to view online video.</p>
<p>It&#8217;s been a long road, and sometimes a bit discouraging, but we&#8217;re finally able to present an early version of our new product, <a href="http://beta.podly.tv">Podly.TV</a>.</p>
<p>Take it for a test drive.  Browse the channels.  If you don&#8217;t see anything you like, do a search and create your own channels.  It&#8217;s totally free.  Anybody can sign up and create a personal channel list.  We&#8217;re adding new videos constantly, and new video sources on a regular basis.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mischel.com/2010/05/27/podly-tv-is-alive/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why Google will win</title>
		<link>http://blog.mischel.com/2010/05/24/why-google-will-win/</link>
		<comments>http://blog.mischel.com/2010/05/24/why-google-will-win/#comments</comments>
		<pubDate>Mon, 24 May 2010 18:21:23 +0000</pubDate>
		<dc:creator>Jim</dc:creator>
				<category><![CDATA[Computers]]></category>
		<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://blog.mischel.com/?p=853</guid>
		<description><![CDATA[I don&#8217;t know who&#8217;s calling the shots over there at Google, but they&#8217;re absolutely brilliant.
Google&#8217;s technology is impressive, no doubt.  They&#8217;ve come a long way in the 12 years or so since two college kids named Sergey Brin and Larry Page came up with a way to greatly improve the quality of Web search [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t know who&#8217;s calling the shots over there at Google, but they&#8217;re absolutely brilliant.</p>
<p>Google&#8217;s technology is impressive, no doubt.  They&#8217;ve come a long way in the 12 years or so since two college kids named Sergey Brin and Larry Page came up with a way to greatly improve the quality of Web search results.  They met quite a bit of resistance when they went looking for funding to build a company.  Everybody thought that Yahoo owned search, and nobody thought you could make money with search.  &#8220;You&#8217;re going to spend millions of dollars to build a phone book of the Web?  How will you make money?&#8221;</p>
<p>Around the same time, there was a small group at Microsoft who wanted to build search.  Microsoft&#8217;s corporate leaders shut that down pretty quickly, for much the same reason:  &#8220;there&#8217;s no money in search.&#8221;  In addition, and perhaps more importantly, Microsoft hadn&#8217;t really embraced the Internet.  Sure, Internet Explorer was in ascendance, mostly due to Netscape&#8217;s incompetence, and other parts of the company were making noise about using the Web, but at heart Microsoft remained a shrink-wrap software company.  Their business was selling Windows and Office.  They embraced the Internet to the extent required to sell those products.</p>
<p>Microsoft eventually embraced search, first grudgingly&#8211;&#8221;It&#8217;s something we have to provide&#8221;&#8211;and finally, after realizing that there was money to be made, by committing serious resources.  But by then it was too late.</p>
<p>It was too late because Google had figured out how to make money with search: first by displaying advertisements on search pages, then with Adwords, Adsense, and other cooperative advertising programs.  Google was transformed from a search engine with some incredibly impressive technology into an advertising company that understands how to make billions of dollars a few pennies at a time.</p>
<p>And Google <em>is</em> an advertising company.  Make no mistake.  Google is in the business of placing ads on your screen, and doing so in a manner that makes you more likely to click on the ad.  That means making them as relevant as possible and walking that fine line between visibility and unobtrusiveness.  It also means getting their ads <em>everywhere</em>, and everything that Google does furthers that goal, directly or indirectly.</p>
<p>Google&#8217;s technology has two jobs:  to deliver ads and to increase their audience.  I know very little about how they deliver ads&#8211;that&#8217;s their proprietary process and, one might argue, the heart of their business.  But they&#8217;re transparent about how they increase their audience.  They provide arguably the best results of any general search engine available.  With YouTube, they dominate Web video.  They have a whole bunch of other free services and software&#8211;translation tools, Google Chrome Web browser, Google Maps, Google Earth, Google Books, Patent Search, Blogger, Mail, SketchUp, Images, and many more&#8211;that make it easier to use the Internet or provide online replacements for traditionally client-bound tools.  By making it easier to use the Internet, they get more people on the Internet.</p>
<p>Google also produces and makes available an incredible amount of program source code that developers can use or include in their products for free.  Just check out <a href="http://code.google.com/intl/en/">Google code</a> sometime.  It&#8217;s full of proven working code that Google paid their employees to develop, and is now giving away for free.  It&#8217;s not that they&#8217;re altruistic.  They know that by making it easier for developers to create quality Web sites, their audience is growing.</p>
<p>Two recent (well, one not so recent) developments show Google&#8217;s commitment.  First, the <a href="http://www.google.com/chrome">Chrome Web browser</a>.  This is Google&#8217;s free browser, which is arguably the best on the market today.  One might ask why Google would go to the expense and effort of creating a new browser and then make large parts of its source code available (see the <a href="http://code.google.com/intl/en/chromium/">Chromium</a> project)?  I can&#8217;t say for sure, but here&#8217;s what I think.</p>
<p>I think that Google wants to do things with the Web that other browsers (Internet Explorer, Firefox, Opera, Safari, etc.) don&#8217;t currently support.  Although it&#8217;s often possible for Google to convince the people in control of those browsers to support new features, Google is left waiting for support.  If they control the browser, then Google can start pushing new technologies on their own schedule.</p>
<p>Whatever the reason behind it, Google Chrome is building market share.  It used to be that Microsoft&#8217;s Internet Explorer had 70 to 75% of the browser market, followed by Firefox in the 20 to 25% range, and everybody else was down in the noise.  The most recent numbers I have put IE below 60% for the first time, Firefox still hanging in there around 20 to 25%, and the rest being shared by Opera, Safari, and Chrome.  Except Chrome is taking market share, most of which is coming from Internet Explorer.</p>
<p>The more recent development is Google&#8217;s support of the <a href="http://www.webmproject.org/">WebM project</a>, a high-quality, open, and <em>free</em> video format.  I cannot overemphasize the importance of this development.  WebM combines a <a href="http://blog.mischel.com/2010/04/19/movie-madness/">container format</a> with free video and audio codecs so that anybody can create and distribute video royalty-free without having to worry about patents or other intellectual property concerns.  Google spent something like $100 million to obtain the rights to the VP8 video codec in order to make this possible.  Then they turned around and made it freely available to anybody.  Why?  Because ubiquitous free video gives Google a huge increase in surface area&#8211;a larger audience&#8211;that they can exploit for the purpose of delivering ads.</p>
<p>From the outside, Google&#8217;s business plan really does look as simple as, &#8220;Make the Web easy to use so that we can deliver more ads to more people.&#8221;</p>
<p>In the process, Google is steamrolling over a number of entrenched companies who thought they had it made.  Consider Adobe, whose Flash player is currently The Standard for online video.  Back in 2007, Flash 8 had something like 95% (perhaps higher) penetration.  That is, 95% of computers connected to the Internet had Flash installed.  Why?  Because of YouTube.  When Adobe released Flash version 9, it achieved more than 90% penetration in just a few months, again in large part (perhaps primarily) because YouTube went to Flash 9 for their video.  Adobe <em>owned</em> Web video.</p>
<p>But Adobe dropped the ball.  For reasons I&#8217;ll never understand, Adobe still clings to the idea that Flash is for creating rich Web apps.  The ability to do rich client things in a Web page is cool, and there was a time when Flash was the best way to do it.  But browsers and computers are more capable now.  I know from experience that it&#8217;s now much easier to build rich applications with JavaScript than it ever was with Flash.  And all you need is a modern browser.  There&#8217;s no need to download and install a Flash control to do it.</p>
<p>After Google&#8217;s WebM announcement last week, Adobe issued a press release saying that they&#8217;ll support WebM &#8220;in a future version.&#8221;  YouTube will continue to use Flash for low-quality videos.  Starting soon, though, higher quality video will be delivered with WebM.  You have to be blind not to see what&#8217;s coming:  the eventual removal of Flash support on YouTube.  But it&#8217;s already over for Adobe Flash.  They will only see decreasing market share.  And Adobe has nobody to blame but themselves.  They ran into much the same thing clinging to their old .FLV format when the rest of the world was moving to .MP4.  The reason?  They make money by selling very expensive software packages that create video files.  Much like their PDF tools, they give away the reader and charge a lot of money for software that creates the files that their free players read.</p>
<p>With WebM, all that goes away.  There are already FFmpeg patches for WebM, and there will likely be some very good free tools.</p>
<p>Microsoft, too, is getting steamrolled by Google.  After Google&#8217;s WebM announcement, Microsoft said that they&#8217;re very excited about the new technology and that Internet Explorer 9 will fully support it as long as the user has installed the proper codec.  If you&#8217;re not familiar with the world of codecs, don&#8217;t feel bad.  Understanding codecs is not something a user should have to do.  Finding and installing the proper codec can be incredibly frustrating and fraught with danger.  If you go looking for a codec for Media Player, for example, you&#8217;ll find yourself confused and in very real danger of inadvertently downloading and installing some malware.</p>
<p>For Microsoft to say, &#8220;as long as the user has installed the proper codec&#8221; is like GM saying that the new car they sell you will be fully functional as long as you find and install a compatible engine.</p>
<p>And don&#8217;t expect Microsoft&#8217;s Media Player to support WebM any time soon. According to Microsoft&#8217;s own <a href="http://support.microsoft.com/kb/316992">Information about the Multimedia file types that Windows Media Player supports</a>, they don&#8217;t even support MP4.  Granted, that article was written two years ago, but it covers Media Player 11, which is the most current version.  That article says, &#8220;You can play back .mp4 media files in Windows Media Player when you install DirectShow-compatible MPEG-4 decoder packs. DirectShow-compatible MPEG-4 decoder packs include the Ligos LSX-MPEG Player and the EnvivioTV.&#8221;  In other words, you have to install a codec made by a third party in order to play a video format that the rest of the world embraced five years ago.</p>
<p>The announcement of WebM is also pushing innovation in another area:  the server.  The day after the WebM announcement, somebody was <a href="http://www.alobbs.com/1386/Streaming_WebM_VP8_One_Day_Later.html">streaming WebM from the Cherokee Web server</a>.  <em>One day!</em>  This has some very interesting ramifications.  An open media format combined with an open Web server (like <a href="http://httpd.apache.org/">Apache</a>) means that a free media server is not far behind.  There goes Adobe&#8217;s <a href="http://www.adobe.com/products/flashmediaserver/">Flash Media Server</a> business.  And quite possibly Microsoft&#8217;s <a href="http://www.microsoft.com/windows/products/winfamily/windowshomeserver/default.mspx">Windows Home Server</a>, especially if somebody releases an easy Linux configuration that includes this hypothetical (but soon to be realized) media server, backup and data recovery, and document management.</p>
<p>It&#8217;s interesting to note that Google hasn&#8217;t had to &#8220;target&#8221; any of these companies in order to take them out.  In fact, Google probably isn&#8217;t even interested in &#8220;taking them out.&#8221;  Google is just doing what it needs to do in order to grow the business.  If it means investing hundreds of millions of dollars so that more people will come online to watch video, then so be it.  If Google makes a few pennies every time somebody watches a video online, that hundred million bucks will be returned in short order.</p>
<p>The really funny part here is that both Microsoft and Adobe had to see it coming. It&#8217;s not like Google made a surprise announcement last week:  there was a big splash when they acquired the VP8 technology a few months back, and Google has been telegraphing this move since at least 2006, when they paid $1.65 billion for YouTube.  That kind of investment says, &#8220;We want to own Web video because we think we can make money at it.&#8221;  No, Microsoft and Adobe saw this coming and knew that they were powerless to stop it.  But rather than embrace VP8 and try to find a way to work with it, they clung to their own product plans hoping that some imaginary <a href="http://en.wikipedia.org/wiki/Maginot_Line">Maginot Line</a> would block Google&#8217;s advance.  Adobe, Microsoft, and other companies whose businesses are built on artificial scarcity (selling bits) are living in the past and will continue to see their market share stolen by companies like Google that can provide <em>better</em> products for free.</p>
<p>You&#8217;re going to see this same thing play out all over again in the world of television.  Google recently announced a deal with Intel and Sony that will put <a href="http://www.google.com/tv/">Google TV</a> on Sony television sets.  Today, something like 25% of all new televisions sold are Internet ready.  Google is ready to go there, and not just because it increases the surface area for their Web advertisements, but also because it gives Google a platform from which to launch an assault on the television advertising market ($70 billion annually in the U.S. alone).</p>
<p>Google&#8217;s competition for that market is a handful of old media companies and Madison Avenue advertising firms, both of which have grown fat and complacent.  Sure, they&#8217;ve been hit by Internet advertising over the years, but it&#8217;s been more of a slow leak in a dike rather than a tsunami that overwhelms the entire system.  Those companies probably aren&#8217;t smart enough to see it coming yet, but when they do see Google riding the wave, they&#8217;ll probably all hunker down behind the dike and hope for the best.  And then complain bitterly (read: try to win through litigation) when they discover that they lost the war while they were sitting there with their thumbs up their butts trying to decide if they should do anything.</p>
<p>Remember, you heard it here first.</p>
<p>I&#8217;m not trying to paint Google in a bad light at all.  On the contrary, I have nothing but admiration for them.  They&#8217;re going about their business.  If the entrenched companies can&#8217;t keep up, it&#8217;s not Google&#8217;s fault.  While the old media companies are refining the horse-drawn carriage, Google is hard at work on the V8 engine.  In the process, Google is making all manner of things available to Internet users and developers, and actually <em>encouraging</em> us to build products that leverage the free services that the company offers.  Given the choice between begging for access from the old media companies or accepting the bounty freely offered by Google, I&#8217;ll throw in with Google.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mischel.com/2010/05/24/why-google-will-win/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Movie madness</title>
		<link>http://blog.mischel.com/2010/04/19/movie-madness/</link>
		<comments>http://blog.mischel.com/2010/04/19/movie-madness/#comments</comments>
		<pubDate>Mon, 19 Apr 2010 16:04:30 +0000</pubDate>
		<dc:creator>Jim</dc:creator>
				<category><![CDATA[Computers]]></category>
		<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://blog.mischel.com/?p=842</guid>
		<description><![CDATA[Outside of YouTube, MP4 is probably the most popular video file format available online.  MP4 videos exist inside a container format that&#8217;s also widely referred to as &#8220;the MP4 container format.&#8221;  Honestly, I&#8217;m not sure what the correct name for the format is, but it&#8217;s described by a standard identified as ISO/IEC 14496-12: &#8220;Information technology [...]]]></description>
			<content:encoded><![CDATA[<p>Outside of YouTube, MP4 is probably the most popular video file format available online.  MP4 videos exist inside a container format that&#8217;s also widely referred to as &#8220;the MP4 container format.&#8221;  Honestly, I&#8217;m not sure what the <em>correct</em> name for the format is, but it&#8217;s described by a standard identified as ISO/IEC 14496-12: &#8220;Information technology &#8211; Coding of audio-visual objects &#8211; Part 12: ISO base media file format.&#8221;  Yeah, that&#8217;s a mouthful.  Let&#8217;s just call it &#8220;Part 12.&#8221;</p>
<p>The document is freely available online as a PDF, although it can be difficult to find.  I just went searching for it again and couldn&#8217;t find the full version.  If I remember where I found it, I&#8217;ll post a link here.</p>
<p style="padding-left: 30px;">Adobe Flash Player Update 3 (9, 0, 115, 0) and higher can play some MP4 files.  The subset of MP4s that Flash can play is described in <a href="http://www.adobe.com/devnet/flv/pdf/video_file_format_spec_v9.pdf">Video File Format Specification Version 9</a>.  That document gives you an idea of the Part 12 file format, although you probably want the full spec if you&#8217;re implementing a reader.</p>
<p>The file format is quite flexible&#8211;perhaps overly so&#8211;but reasonably easy to parse once you grok the basic structure.  I coded up a quick MP4 reader in a day, and within two days had my web crawler extracting metadata from files I located online.  But then I ran into a problem:  my movie player would sometimes hang when trying to play a file.  The really weird part was that the movie would play fine if I downloaded it first.  It was only when trying to play from online that I experienced the problem.</p>
<p>It didn&#8217;t take long to find the problem.  Or at least <em>part</em> of the problem.</p>
<p>Data in the Part 12 file format is organized in &#8220;boxes.&#8221;  Those boxes contain all manner of information:  a file header, overall movie information, information about the individual tracks, synchronization data for different tracks, etc.  Part 12 describes the overall structure of the file and the contents of the &#8220;moov&#8221; box that contains basic movie metadata (number and types of tracks, duration, codecs required to play the tracks, etc.).</p>
<p>Another box, called &#8220;mdat,&#8221; contains the actual movie data:  the video and audio information that will be played.</p>
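<p>To make the box layout concrete, here is a minimal sketch of walking the top-level boxes.  It assumes the header layout from Part 12: a 32-bit big-endian size followed by a four-character type, where a size of 1 means a 64-bit size follows and a size of 0 means the box runs to the end of the file.  The function names and the synthetic sample data are mine, made up for illustration, and this is not the reader I actually wrote.</p>

```python
import io
import struct


def iter_top_level_boxes(stream):
    """Yield (box_type, offset, size) for each top-level box in a seekable stream."""
    offset = 0
    while True:
        header = stream.read(8)
        if len(header) != 8:
            return  # end of data, or a truncated header
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            size = struct.unpack(">Q", stream.read(8))[0]
            header_len = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the file.
            size = stream.seek(0, io.SEEK_END) - offset
        if header_len > size:
            return  # malformed box; stop rather than loop on bad data
        yield box_type.decode("ascii", "replace"), offset, size
        offset += size
        stream.seek(offset)


def make_box(box_type, payload):
    """Build a synthetic box: 4-byte size, 4-byte type, then the payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic sample: a file header, then movie data, then movie metadata.
sample = (
    make_box(b"ftyp", b"isom" + bytes(4))
    + make_box(b"mdat", bytes(32))
    + make_box(b"moov", bytes(16))
)
boxes = list(iter_top_level_boxes(io.BytesIO(sample)))
order = [name for name, _, _ in boxes]
print(order)  # ['ftyp', 'mdat', 'moov']
```

<p>Because the loop only reads headers and seeks past the payloads, it never has to touch the movie data itself, which is what makes pulling metadata out of large files cheap.</p>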
<p>In order to start playing a movie, a player must have the metadata.  The player can&#8217;t play the first frame until it knows how to decode that frame.  The movie data, on the other hand, can be delivered relatively slowly:  at whatever the playback speed is.  In other words, playing a movie consists of these steps:</p>
<pre>Read the metadata.
Determine if the movie is playable with this player.
repeat
    Read movie data (audio, visual, etc.) frame
    Render frame
until end of movie</pre>
<p>So it makes sense to organize data in the movie file to facilitate that.  Right?  In fact, the Part 12 document makes two very pertinent recommendations:</p>
<blockquote><p>2)  It is strongly <strong>recommended</strong> that all header boxes be placed first in their container: these boxes are the Movie Header, Track Header, Media Header, and the specific media headers inside the Media Information Box (e.g. the Video Media Header).</p>
<p>8)  It is <strong>recommended</strong> that the progressive download information box be placed as early as possible in files, for maximum utility.</p></blockquote>
<p>The emphasized &#8220;<strong>recommended</strong>&#8221; is in the original document.</p>
<p>There are good reasons for these recommendations, as I discovered in the first problem file I looked at.  In that particular file, the &#8220;mdat&#8221; box, which contains the frame data, is placed at the front of the file:  immediately after the file header.  &#8220;mdat&#8221; is 89 megabytes long.  It&#8217;s followed by the &#8220;moov&#8221; box that&#8217;s a little less than two megabytes.  A movie player has to download 89 megabytes of stuff before it can get to the metadata that tells the player how to play the movie.  89 megabytes might not sound like much, but at 10 megabits per second (which would be a very fast residential connection here in the U.S.), it&#8217;s well over a minute.  Nobody&#8217;s going to wait that long for their video to start playing.</p>
<p>I suspect that whoever made these movies has no idea that they&#8217;re effectively unplayable over the Internet, and might not even care.  <em>I</em> care, because I&#8217;m not going to download the entire movie just to see if I&#8217;m really interested in watching it.</p>
<p>What surprises me is that video player software doesn&#8217;t recognize this and skip over the movie data to get to the metadata.  The HTTP/1.1 specification makes it very easy to get part of a file:  the client sends a Range header with the request.  The movie player should see that the &#8220;mdat&#8221; box comes before &#8220;moov&#8221;, and make another request to get &#8220;moov&#8221;.  It could then go back to &#8220;mdat&#8221; after digesting the metadata.</p>
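<p>Here&#8217;s a rough sketch of that idea in Python.  It is not how any particular player works.  It assumes the usual box layout (a 4-byte big-endian size followed by a 4-byte type, with a size of 1 meaning that a 64-bit size follows, and a size of 0 meaning that the box runs to the end of the file) and a server that honors Range requests.  The names <em>range_header</em>, <em>http_reader</em>, <em>walk_top_level_boxes</em>, and <em>locate_moov</em> are made up for illustration.</p>

```python
import struct
import urllib.error
import urllib.request
from typing import Callable, Iterator, Optional, Tuple

# read_range(offset, length) returns up to `length` bytes starting at `offset`.
ReadRange = Callable[[int, int], bytes]


def range_header(offset: int, length: int) -> dict:
    """Build an HTTP/1.1 Range header for bytes [offset, offset + length)."""
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def http_reader(url: str) -> ReadRange:
    """Return a read_range function that fetches byte ranges over HTTP.

    Assumes the server honors Range; one that ignores it will send the
    whole file, which defeats the purpose.
    """
    def read_range(offset: int, length: int) -> bytes:
        req = urllib.request.Request(url, headers=range_header(offset, length))
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code == 416:  # requested range is past the end of the file
                return b""
            raise
    return read_range


def walk_top_level_boxes(read_range: ReadRange) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each top-level box, reading only headers."""
    offset = 0
    while True:
        header = read_range(offset, 16)
        if len(header) < 8:
            return  # ran off the end of the file
        size, box_type = struct.unpack(">I4s", header[:8])
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if len(header) < 16:
                return
            size = struct.unpack(">Q", header[8:16])[0]
        name = box_type.decode("latin-1")
        if size == 0:
            yield name, offset, 0  # box extends to the end of the file
            return
        if size < 8:
            return  # corrupt header; stop rather than loop forever
        yield name, offset, size
        offset += size  # skip the box body without downloading it


def locate_moov(read_range: ReadRange) -> Optional[Tuple[int, int, bool]]:
    """Return (offset, size, mdat_came_first) for "moov", or None if absent."""
    mdat_seen = False
    for name, offset, size in walk_top_level_boxes(read_range):
        if name == "mdat":
            mdat_seen = True
        elif name == "moov":
            return offset, size, mdat_seen
    return None
```

<p>With the offset and size in hand, a player could fetch just the metadata with one more ranged request and then go back for the frame data.  For a file laid out like the one above, that costs a few 16-byte header reads plus the two-megabyte &#8220;moov&#8221; box, rather than 89 megabytes.</p>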
<p>I wonder how hard it would be to create a tool that fixes those backwards movies . . .</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mischel.com/2010/04/19/movie-madness/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What is dragonwood?</title>
		<link>http://blog.mischel.com/2010/03/09/what-is-dragonwood/</link>
		<comments>http://blog.mischel.com/2010/03/09/what-is-dragonwood/#comments</comments>
		<pubDate>Wed, 10 Mar 2010 00:03:44 +0000</pubDate>
		<dc:creator>Jim</dc:creator>
				<category><![CDATA[Carving]]></category>
		<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://blog.mischel.com/?p=804</guid>
		<description><![CDATA[It&#8217;s rare that I&#8217;m stumped when I try to find something on Google, but this one beat me.  Somebody on the woodcarving forum asked about &#8220;dragonwood.&#8221;  Always curious, I thought I&#8217;d look it up.
Dragonwood appears to be very commonly used for the trunks and larger branches of artificial (silk) trees.  It&#8217;s also commonly used to [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s rare that I&#8217;m stumped when I try to find something on Google, but this one beat me.  Somebody on the woodcarving forum asked about &#8220;dragonwood.&#8221;  Always curious, I thought I&#8217;d look it up.</p>
<p>Dragonwood appears to be very commonly used for the trunks and larger branches of artificial (silk) trees.  It&#8217;s also commonly used to make perches for pet birds, and I gather somewhat less commonly used to make cat trees and cheap furniture.  That&#8217;s all interesting, but I couldn&#8217;t find a picture of a dragonwood tree or anything that gave me the botanical name of the silly thing.  The best I could find is that it grows in Florida.</p>
<p>Somebody else on the forum posted an answer this afternoon, identifying the wood as <em><a href="http://plants.usda.gov/java/profile?symbol=LYFE">Lyonia ferruginea</a></em> (rusty staggerbush), a shrub or small tree that grows in Florida, Georgia, and South Carolina.  In case you&#8217;re interested, that person also indicated that it&#8217;s good carving wood.</p>
<p>I&#8217;m really surprised that this one stumped me.  The common name dragonwood (less often, &#8220;dragon wood&#8221;) is used in a lot of places, but I was unable to find any reference that showed its botanical name.  I figured I could find it just like I can type &#8220;bottle brush tree&#8221; and get the botanical name.  No such luck.</p>
<p>One resource said that &#8220;dragonwood&#8221; was a corruption of the original &#8220;draggin&#8217; wood&#8221;, which describes how they get the wood out of the thicket after it&#8217;s cut.</p>
<p>Hopefully anybody else looking for a description of dragonwood will find this post and not have to wade through a few dozen pages of links to fake plants and parrot cage goodies.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mischel.com/2010/03/09/what-is-dragonwood/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sniffing network traffic</title>
		<link>http://blog.mischel.com/2009/10/21/sniffing-network-traffic/</link>
		<comments>http://blog.mischel.com/2009/10/21/sniffing-network-traffic/#comments</comments>
		<pubDate>Thu, 22 Oct 2009 01:00:38 +0000</pubDate>
		<dc:creator>Jim</dc:creator>
				<category><![CDATA[Computers]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.mischel.com/?p=647</guid>
		<description><![CDATA[My latest crawler modifications require me to scrape Web pages that host videos so that I can obtain metadata (title, description, date posted, etc.) that we place in our index.  Unfortunately, there&#8217;s no standard way for sites to present such information.  ESPN and Vimeo have HTML &#60;meta&#62; tags that provide some info, but I have [...]]]></description>
			<content:encoded><![CDATA[<p>My latest crawler modifications require me to scrape Web pages that host videos so that I can obtain metadata (title, description, date posted, etc.) that we place in our index.  Unfortunately, there&#8217;s no standard way for sites to present such information.  ESPN and Vimeo have HTML &lt;meta&gt; tags that provide some info, but I have to go parsing through the body of the document to find the date.  (And yes, I&#8217;m aware that Vimeo has an API that will make this a moot point.  I&#8217;ll be investigating that soon.)</p>
<p>Other sites are much worse in that they provide <em>no</em> metadata in the HTML.  For example, one site&#8217;s video page is very code-heavy.  Requiring that the page be reloaded every time you request a new video would require a lot of network traffic.  Their design instead uses JavaScript to request a particular video&#8217;s metadata from a server.  Loading a new video involves downloading just a few kilobytes of data.</p>
<p>I spent some time this afternoon searching through a video page&#8217;s HTML and the associated JavaScript, looking for the magic incantation that would get me the data I&#8217;m looking for.  The amount of code involved is staggering, and I quickly went cross-eyed trying to decipher it before I hit on the idea of hooking up a sniffer to see if I could identify the HTTP request that gets the data.</p>
<p>It took me all of five minutes to download and install <a href="http://www.cleanersoft.com/sniffer/free_http_sniffer.htm">Free Http Sniffer</a>, request a video from the site in question, and locate the magic line in the 230 or so requests that the page makes when it loads.  Problem solved.  Now all I have to do is write code that&#8217;ll transform a video page URL into a request for the metadata, and I&#8217;m set.</p>
<p>I have no idea why I didn&#8217;t think of the sniffer earlier.  I&#8217;d used one before for a similar purpose.  I suspect I&#8217;ll be making heavy use of it in the near future as I expand the number of sites that we crawl for media.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mischel.com/2009/10/21/sniffing-network-traffic/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Facebook is broke!</title>
		<link>http://blog.mischel.com/2009/09/01/facebook-is-broke/</link>
		<comments>http://blog.mischel.com/2009/09/01/facebook-is-broke/#comments</comments>
		<pubDate>Wed, 02 Sep 2009 00:20:09 +0000</pubDate>
		<dc:creator>Jim</dc:creator>
				<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://blog.mischel.com/?p=604</guid>
		<description><![CDATA[Checking Facebook tonight, I got a notification that I had hidden some applications from my news feed.  Thank you very much, Facebook, but I didn&#8217;t need reminding.
So I canceled the notification and clicked on something else.  It reminded me again.  Okay, so I added those applications back to my feed.  Won&#8217;t let a little bug [...]]]></description>
			<content:encoded><![CDATA[<p>Checking Facebook tonight, I got a notification that I had hidden some applications from my news feed.  Thank you very much, Facebook, but I didn&#8217;t need reminding.</p>
<p>So I canceled the notification and clicked on something else.  It reminded me again.  Okay, so I added those applications back to my feed.  Won&#8217;t let a little bug stop me from posting a comment on a friend&#8217;s wall.</p>
<p>Except now I keep getting this:</p>
<p><img class="alignnone size-full wp-image-605" title="facebook" src="http://blog.mischel.com/wp-content/uploads/2009/09/facebook.jpg" alt="facebook" width="556" height="474" /></p>
<p>I can still post comments (the dialog is not modal), but it&#8217;s just &#8230; weird.</p>
<p><span style="color: #993300;">Later:  Closing the Facebook tab in my browser and re-loading fixed the problem.</span></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mischel.com/2009/09/01/facebook-is-broke/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Yahoo! Mail customer care stinks</title>
		<link>http://blog.mischel.com/2009/08/18/yahoo-mail-customer-care-stinks/</link>
		<comments>http://blog.mischel.com/2009/08/18/yahoo-mail-customer-care-stinks/#comments</comments>
		<pubDate>Tue, 18 Aug 2009 19:12:09 +0000</pubDate>
		<dc:creator>Jim</dc:creator>
				<category><![CDATA[Idiocy]]></category>
		<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://blog.mischel.com/?p=587</guid>
		<description><![CDATA[I am currently engaged in a week-long struggle with Yahoo! Mail&#8217;s &#8220;customer care&#8221; about an email that they&#8217;re blocking.  Since the beginning of the year, my server has been sending a daily report about crawler performance to me and to my coworkers.  The email consists of a single HTML file, inside of which are some [...]]]></description>
			<content:encoded><![CDATA[<p>I am currently engaged in a week-long struggle with Yahoo! Mail&#8217;s &#8220;customer care&#8221; about an email that they&#8217;re blocking.  Since the beginning of the year, my server has been sending a daily report about crawler performance to me and to my coworkers.  The email consists of a single HTML file, inside of which are some internal links and hundreds of text URLs, but no actual links to those URLs.  Our company&#8217;s email is hosted by Yahoo.</p>
<p>Until 10 days ago, that report was delivered daily, without fail.  But since we changed our colocation setup and got new IP addresses, the mail has been sporadic:  bouncing eight of the last ten times I&#8217;ve tried to send it.  The error message I get back from Yahoo&#8217;s server says that the message is rejected &#8220;for policy reasons.&#8221;  Digging deeper, I find that Yahoo&#8217;s filter seems to think that there are &#8220;links to potentially objectionable material or malicious software.&#8221;</p>
<p>When contacting Yahoo, I gave them considerable detail, including the text of the message, a full description of the problem, and the reason why I thought that their filter was being overzealous.  Their responses have been canned boilerplate paragraphs, first asking for information that is not relevant or that I&#8217;ve already supplied, then explaining the policy:  the same policy that&#8217;s on their Web site and that I told them I already understood.  I have yet to receive a response from Yahoo to indicate that they&#8217;ve actually read and understood <em>any</em> of the information that I&#8217;ve sent to them.  I&#8217;m convinced that if I sent a message requesting a ham and swiss on rye, they&#8217;d reply by asking me to forward the full headers from the email in question.</p>
<p>I would strongly discourage anybody from using Yahoo for their business email.  Their response to this simple request has convinced me that Yahoo&#8217;s incompetence is not limited to search (which they&#8217;ve finally agreed to farm out to Microsoft), but permeates the entire organization.  If you want reliable email and intelligent, <em>helpful</em> support, find somebody other than Yahoo to host it.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mischel.com/2009/08/18/yahoo-mail-customer-care-stinks/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
