Host It Yourself: Stop embedding external content on your site

Embedded content is bad for visitors’ privacy and for long‐term content availability and archiving. “Sorry, we cannot display this document”, “Video deleted by owner”, “The image you are requesting does not exist or is no longer available”, “410 Gone”, and “404 Not Found”. Does this sound like your website?

“This video has been removed by the user. Sorry about that.” Apology from YouTube.

Those images, videos, and PDFs that are embedded into pages and articles may be gone in a month or a year. If your site is not hosting the resources itself, it has no real control over the resources’ long‐term availability.

This is a plea to website owners and bloggers: Please host your own content!

Availability

When visiting old pages (in Internet terms, anything published more than a year ago), I often have no idea what the page is about. The page refers to a document, a video, a snippet of code, or something else that no longer exists. Sometimes whoever uploaded the referenced resource has deleted it. Other times, a service that was the pinnacle of modern hosting and presentation at the time has since gone out of business. More often, some automated battleship of a bot fighting copyright infringement has mistakenly aimed its legal cannons at your content.

Services change. The way an embed looks now is not necessarily how it will look in the future. Maybe you will not agree with how embedded videos from YouTube are presented in nine months. You are likely to have no say in the matter.

The only way to ensure consistent and continued availability is to host all content on your own servers. By doing so, you can ensure that at least some of its value is maintained over time. A back catalog of good‐quality content is increasingly important in these search‐driven times.

Privacy and who agreed to what?

What are the scripts you embed from third parties doing now? And what will they be doing in six months? YouTube’s embedding code alone is 160 KB of scripts with full access to your site.

Google’s main privacy policy (which also covers YouTube) alone is 2500 words. That is a whole lot to implicitly consent to on behalf of your visitors. They are not visiting YouTube directly, so how can they be expected to be aware of and agree to its privacy policies? What information is that one embedded message from Twitter or code snippet from GitHub collecting about your visitors? Could you fully answer that question if one of your visitors were to ask you? Does your site’s own privacy policy acknowledge the other policies the user should be aware of for all the third‐party scripts and services?
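A first step toward answering those questions is simply knowing which third parties a page pulls content from. The following is a minimal sketch in Python, using only the standard library, that lists the external hosts a page's HTML asks the browser to contact. The names EmbedCollector and find_third_party_embeds, the tag list, and the blog.example address are illustrative assumptions, not part of any existing tool. It only sees embeds present in the static HTML, so anything injected later by a script will be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tags that make the browser fetch a resource, and the attribute holding its URL.
# This list is an assumption for illustration and is not exhaustive.
EMBED_ATTRS = {
    "script": "src",
    "iframe": "src",
    "img": "src",
    "video": "src",
    "audio": "src",
    "source": "src",
    "embed": "src",
    "object": "data",
}


class EmbedCollector(HTMLParser):
    """Collects embedded resources whose host differs from the page's host."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).hostname
        self.third_party = {}  # host -> list of (tag, absolute URL)

    def handle_starttag(self, tag, attrs):
        attr_name = EMBED_ATTRS.get(tag)
        if attr_name is None:
            return
        value = dict(attrs).get(attr_name)
        if not value:
            return
        # Resolve relative and protocol-relative URLs against the page URL.
        url = urljoin(self.page_url, value)
        host = urlparse(url).hostname
        # Exact host comparison: a subdomain you own (for example a CDN)
        # is also reported, which errs on the side of listing too much.
        if host and host != self.page_host:
            self.third_party.setdefault(host, []).append((tag, url))


def find_third_party_embeds(html, page_url):
    """Return a dict mapping each external host to the resources it serves."""
    collector = EmbedCollector(page_url)
    collector.feed(html)
    collector.close()
    return collector.third_party


if __name__ == "__main__":
    sample = (
        '<img src="/images/cat.png">'
        '<iframe src="https://www.youtube.com/embed/abc"></iframe>'
        '<script src="//platform.twitter.com/widgets.js"></script>'
    )
    found = find_third_party_embeds(sample, "https://blog.example/post/")
    for host, items in sorted(found.items()):
        print(host, [tag for tag, _ in items])
```

Every host in the result is a party whose privacy policy your visitors are implicitly subject to, and a candidate for replacing with a self‐hosted copy.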

At some point, self‐hosted content becomes easier and more manageable. You are in control of all the content, now and five years from now.

“Document deleted by owner. Sorry, we can’t display this document.” Apology from Scribd.
