If you’ve ever tried to watch a multi-part series of videos on YouTube, you’ve probably run into the problem of finding the next episode. If you list by title, you’ll often run into things like “Carving the Mushroom, Part 1” followed by parts 2, 3, 4, 6, 7, 9, 5, and 8, in that order. Why are 5 and 8 out of order? Because their titles are labeled inconsistently, and that changes how they sort. All the other parts are labeled in the style of “Part 1,” but parts 5 and 8 are “Pt. 5” and “Pt. 8,” respectively, which puts them at the end of the list.
Another common mistake is changing the title. For example, most of the episodes will be “Carving the Mushroom,” but one or two of them will be “Carving the shroom,” which will result in those episodes being sorted out of order. Alphabetic listing absolutely depends on the video titles all having the same format.
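To make the sorting problem concrete, here is a small Python sketch. The titles are made-up examples in the spirit of the ones above, and episode_number is a hypothetical helper of mine, not anything YouTube provides. It shows a plain alphabetical sort pushing the “Pt.” titles to the end, and a sort keyed on the extracted number putting them back in order.

```python
import re

# Made-up sample titles; parts 5 and 8 use "Pt." instead of "Part".
titles = [
    "Carving the Mushroom, Part 1",
    "Carving the Mushroom, Part 2",
    "Carving the Mushroom, Pt. 5",
    "Carving the Mushroom, Part 6",
    "Carving the Mushroom, Pt. 8",
    "Carving the Mushroom, Part 9",
]

def episode_number(title):
    """Hypothetical helper: pull the number that follows 'Part' or 'Pt.'."""
    match = re.search(r"(?:Part|Pt\.?)\s*(\d+)", title, re.IGNORECASE)
    # Titles with no recognizable number sort last.
    return int(match.group(1)) if match else float("inf")

# Plain string sort: "Pa..." sorts before "Pt...", so 5 and 8 land at the end.
alphabetical = sorted(titles)

# Sorting on the extracted number restores the intended order.
by_number = sorted(titles, key=episode_number)

print([episode_number(t) for t in alphabetical])
print([episode_number(t) for t in by_number])
```

This only handles the two spellings from the example. A changed series name, like “Carving the shroom,” would still need some kind of fuzzy matching on the title itself.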
YouTube is the most visible example of this episodic content problem, but I’ve seen similar errors on all types of episodic content. Fan fiction, amateur story sites, blogs, and podcasts are other examples that I’ve personally experienced problems with. I suspect that there are many others.
Commercial streaming services like Netflix and Prime Video apparently have standards for marking episodic content so that it can be ordered. In an ideal world, they’d all agree on one common standard. I do not know whether such a standard exists. There’s certainly no standard for YouTube videos! I’ve been wanting YouTube to provide some assistance in that area for 15 years. It should be easy for them to offer a tool that makes sure the videos in a series all have the same title and are all tagged with the right season and episode. Or whatever ordering system the creator wants.
Is current AI technology up to that? Could it correctly satisfy the request, “List all the episodes of ‘Carving the stabby thing’, in order”? If an AI can do that, then it should be pretty simple to create a tool that will identify individual series, order the episodes, and rewrite the titles to match the parameters that are input. Tell it how you want the episodes marked (e.g. “Season X Episode Y”, “Part X”, “Part X of Y”, etc.), and the AI could group the episodes, tag them, and rewrite the titles accordingly.
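For a sense of what the group, order, and rewrite steps might look like, here is a rough Python sketch. To be clear, this is not the AI approach I’m speculating about. It’s a rule-based stand-in that uses a regular expression and the standard library’s difflib for fuzzy matching. Every name in it (EPISODE_RE, parse, group_series, rewrite) is hypothetical, the 0.8 similarity threshold is an arbitrary guess, and the sample titles are made up.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

# Assumed title shape: "<series name> <separator> Part/Pt./Episode/Ep. <number>".
EPISODE_RE = re.compile(
    r"^(?P<series>.*?)[\s,:-]*\b(?:Part|Pt\.?|Episode|Ep\.?)\s*(?P<num>\d+)\s*$",
    re.IGNORECASE,
)

def parse(title):
    """Split a title into (series name, episode number), or None if no match."""
    m = EPISODE_RE.match(title)
    if m is None:
        return None
    return m.group("series").strip(), int(m.group("num"))

def similarity(a, b):
    """Case-insensitive similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def group_series(titles, threshold=0.8):
    """Put titles whose series names look alike into the same group."""
    groups = []  # each group is a list of (series, number) pairs
    for title in titles:
        parsed = parse(title)
        if parsed is None:
            continue  # leave unrecognized titles for a human to sort out
        for group in groups:
            # Compare against the first series name seen in the group.
            if similarity(parsed[0], group[0][0]) >= threshold:
                group.append(parsed)
                break
        else:
            groups.append([parsed])
    return groups

def rewrite(group, template="{series}, Part {num}"):
    """Rewrite one group's titles, in episode order, using the given template."""
    # Use the most common spelling of the series name as the canonical one.
    canonical = Counter(series for series, _ in group).most_common(1)[0][0]
    return [template.format(series=canonical, num=num)
            for _, num in sorted(group, key=lambda entry: entry[1])]

sample = [
    "Carving the Mushroom, Part 1",
    "Carving the shroom Pt. 3",
    "Carving the Mushroom, Part 2",
    "Knitting a Scarf, Part 1",
    "Carving the Mushroom - Part 4",
]
for group in group_series(sample):
    print(rewrite(group))
```

The template argument stands in for “tell it how you want the episodes marked.” Something like “Part X of Y” would need the group size passed in as well, and season numbering would need more than a single number per title. Those are exactly the kinds of cases where I’d expect hand-written rules to run out of steam.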
I’m confident that, given enough data and time to grind through it, I could create a machine-learning model that would largely solve the problem. It wouldn’t be perfect, but it’d be a whole lot better than the current anarchy. Even so, I think an AI solution would be faster to build and would do a better job, largely unassisted. Certainly the AI solution would be more robust than the ML solution.
How hard would it be to build such a thing? I just might have to find out.