Host It Yourself: Stop embedding external content on your site

Embedded content is bad for visitors’ privacy and for long-term content availability and archiving. “Sorry, we can’t display this document”, “Video deleted by owner”, “The image you are requesting doesn’t exist or is no longer available”, “410 Gone”, and “404 Not Found”. Does this sound like your website?

“This video has been removed by the user. Sorry about that.” – YouTube message

Those images, videos, and PDFs that are embedded into pages and articles may be gone in a month or a year. If your site isn’t hosting the resources itself, it has no real control over the resources’ long-term availability.

This is a plea to website owners and bloggers: Please host your content!

Availability

When visiting old pages (in Internet terms, that means anything published over a year ago), I often have no idea what the page is about. The page refers to a document, a video, a snippet of code, or something else that no longer exists. Sometimes whoever uploaded the referenced resource has deleted it. Other times, a service that was the pinnacle of modern hosting and presentation at the time has since gone out of business. More often, some automated battleship of a bot fighting copyright infringement has mistakenly aimed its legal cannons at your content.

Services change. The way an embed looks now isn’t necessarily how it will look in the future. Maybe you won’t agree with how embedded videos from YouTube are presented nine months from now. You are likely to have no say in the matter.

The only way to ensure consistent and continued availability is to host all content on your own servers. Doing so helps your pages retain their value over time. A back catalog of good-quality content is increasingly important in these search-driven times.

Privacy and who agreed to what?

What are the scripts you embed from third parties doing now, and what will they be doing in six months? YouTube’s embedding code alone is 160 KB of scripts with full scripting access to your website.

Google’s main privacy policy (which also covers YouTube) alone is 2500 words. That is a whole lot to implicitly consent to on behalf of your visitors. They aren’t visiting YouTube directly, so how can they be expected to be aware of and agree to its privacy policy? What information is that one embedded message from Twitter or code snippet from GitHub collecting about your visitors? Could you fully answer that question if one of your visitors asked? Does your site’s own privacy policy list the other policies that apply to visitors through all the third-party scripts and services you embed?
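If you are not sure which third parties your own pages pull in, a rough audit is a reasonable first step. The following is a minimal sketch in Python, using only the standard library, that lists resources in a page's HTML that are loaded from a host other than your own. The names `ThirdPartyFinder` and `find_third_party_resources`, the host `example.com`, and the sample page are all made up for illustration. It only looks at a handful of tags, and it compares host names exactly, so a subdomain of your own site would also be reported. Resources injected later by scripts are not seen at all.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tag -> attribute that makes the browser fetch a resource.
# Illustrative subset only; srcset, CSS url(), and script-injected
# resources are not covered by this sketch.
RESOURCE_ATTRS = {
    "script": "src",
    "iframe": "src",
    "img": "src",
    "video": "src",
    "audio": "src",
    "source": "src",
    "embed": "src",
    "object": "data",
    "link": "href",
}


class ThirdPartyFinder(HTMLParser):
    """Collect resource URLs whose host differs from the site's own host."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host.lower()
        self.found = []  # list of (tag, host, url) tuples

    def handle_starttag(self, tag, attrs):
        attr_name = RESOURCE_ATTRS.get(tag)
        if attr_name is None:
            return
        attrs = dict(attrs)
        # Only <link rel="stylesheet"> is treated as a fetched resource;
        # other link types (canonical, alternate, ...) are skipped.
        if tag == "link" and "stylesheet" not in (attrs.get("rel") or "").lower().split():
            return
        url = attrs.get(attr_name)
        if not url:
            return
        # Relative URLs and data: URLs have no host and are not third-party.
        host = urlparse(url).hostname
        if host and host != self.own_host:
            self.found.append((tag, host, url))


def find_third_party_resources(html, own_host):
    """Return (tag, host, url) for every externally hosted resource found."""
    finder = ThirdPartyFinder(own_host)
    finder.feed(html)
    return finder.found


if __name__ == "__main__":
    # Hypothetical page fragment; VIDEO_ID is a placeholder.
    page = """
    <img src="/images/photo.jpg">
    <iframe src="https://www.youtube.com/embed/VIDEO_ID"></iframe>
    <script src="https://platform.twitter.com/widgets.js"></script>
    """
    for tag, host, url in find_third_party_resources(page, "example.com"):
        print(tag, host, url)
```

Each host that shows up in the output is another party whose privacy policy your visitors are implicitly subject to, and another resource that can disappear without your say.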

At some point, self-hosting your content becomes the easier and more manageable option. You are in control of all the content, both now and five years from now.

“Document deleted by owner. Sorry, we can’t display this document.” Apology from Scribd.

Update (): The General Data Protection Regulation (GDPR) went into effect in the European Union (EU) and the European Economic Area (EEA) earlier . The new regulation mandates that websites pay attention to privacy as discussed in this article. I believe this is a positive development! Read some of my newer articles on the GDPR for more information.