Importing my LinkedIn archive into Hugo
For years I've been publishing on LinkedIn. And for years I've had the same discomfort: it's my content, but it lives in someone else's garden. That's part of why I liked "blogging" in the first place… but blogging is mainly fun if you have an audience (which LinkedIn gives you access to).
LinkedIn is increasingly a walled garden (and getting more aggressive about external links), so I wanted a way to keep my own words on my own website, without paying yet another platform subscription for the privilege.
So I wrote a tiny script to import my LinkedIn data export into this blog.
Why?
- To experiment with AI (and learn what it's actually good at, vs. what sounds good on Twitter).
- To dog-food a few features of my Hugo setup + the Adritian theme (topics, tags, related posts, search, and "real" content volume).
- To keep my content in one place, on infrastructure I control, at essentially zero marginal cost - as I've been doing with my personal site for a while now, thanks to Hugo.
What LinkedIn gives you when you "Download your data"
LinkedIn lets you request an archive of your account data:
- Settings → Data privacy → "Get a copy of your data"
- Wait a bit
- Download a file that is basically a pile of CSVs (and a few HTML files)
The archive is big. It includes things like connections, messages, reactions, comments, ads you've seen, saved items, profile data… you name it.
For writing purposes, the important bits are:
- `Shares.csv`: your posts/shares (the "feed update" kind of content).
- `Articles/Articles/*.html`: long-form LinkedIn articles as HTML files.
- `Rich_Media.csv`: metadata about media (but not an actual folder full of images you can just re-host).
What the script imports (and what it ignores)
The script lives at `scripts/import_linkedin/import_linkedin_posts.go`.
It imports two content types:
1) Shares (posts)
Source: `Shares.csv`
What gets imported:
- `Date` → Hugo `date`
- `ShareCommentary` → post body
- `ShareLink` → `originalURL` (when present)
How it lands in Hugo:
- One Markdown file per post under `content/blog/`
- Frontmatter includes tags like `linkedin`, `imported`, and `share` (so I can filter/group them later; see the sketch below)
- A slug based on the first line of the post
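For illustration, a generated share could look roughly like this. The `date`, `tags`, and `originalURL` fields follow the mapping above; the rest (the exact title key, the URL) is my guess at a plausible result, not a verbatim dump from the script:

```markdown
---
title: "First line of the post, reused for the slug"
date: 2021-03-15
tags: ["linkedin", "imported", "share"]
originalURL: "https://www.linkedin.com/feed/update/..."
---

The rest of the ShareCommentary text becomes the post body.
```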
2) Articles (long-form posts)
Source: `Articles/Articles/*.html`
What gets imported:
- `<title>` → Hugo `title`
- A timestamp extracted from the filename (or from the HTML when possible)
- The `<body>` converted to "good enough" Markdown-ish text (headings, lists, bold/italic, links, blockquotes)
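To make "good enough" concrete: the idea is a handful of tag-level substitutions rather than a full HTML parser. Here is a minimal sketch of that approach - my illustration, not the importer's actual code:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// htmlToMarkdownish is a deliberately naive sketch of a "good enough"
// conversion: it rewrites a handful of common tags into Markdown and
// leaves everything else untouched.
func htmlToMarkdownish(html string) string {
	rules := []struct {
		re   *regexp.Regexp
		repl string
	}{
		{regexp.MustCompile(`(?s)<h2[^>]*>(.*?)</h2>`), "## $1\n"},
		{regexp.MustCompile(`(?s)<h3[^>]*>(.*?)</h3>`), "### $1\n"},
		{regexp.MustCompile(`(?s)<(?:strong|b)>(.*?)</(?:strong|b)>`), "**$1**"},
		{regexp.MustCompile(`(?s)<(?:em|i)>(.*?)</(?:em|i)>`), "*$1*"},
		{regexp.MustCompile(`(?s)<a[^>]*href="([^"]*)"[^>]*>(.*?)</a>`), "[$2]($1)"},
		{regexp.MustCompile(`(?s)<li[^>]*>(.*?)</li>`), "- $1\n"},
		{regexp.MustCompile(`(?s)<blockquote[^>]*>(.*?)</blockquote>`), "> $1\n"},
		{regexp.MustCompile(`</?(?:p|ul|ol|body|html)[^>]*>`), "\n"},
	}
	for _, r := range rules {
		html = r.re.ReplaceAllString(html, r.repl)
	}
	return strings.TrimSpace(html)
}

func main() {
	fmt.Println(htmlToMarkdownish(
		`<h2>Hello</h2><p>Some <strong>bold</strong> text and a <a href="https://example.com">link</a>.</p>`))
}
```

Anything the rules don't match passes through untouched, which is exactly the tradeoff: imperfect Markdown beats losing the text.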
Important limitation: the LinkedIn export does not give you the article's canonical LinkedIn URL in an easy way, so those posts don't get an `originalURL` automatically.
The "missing images" problem (and why this is still worth doing)
Here's the part that makes this importer imperfect: the LinkedIn download isn't a neat "here are your posts and here are the images you uploaded".
In practice:
- `Shares.csv` doesn't contain your post images as actual files.
- Articles may reference images via external `media.licdn.com` URLs, but you're not getting a clean local asset bundle you can re-host.
So yes: if your content relies heavily on images, this approach has limits.
But it's still useful, because the core value I want to preserve is the writing itself: the ideas, the posts, the timeline, the ability to search it, and the ability to link to it without depending on another product's whims.
If I ever want to "upgrade" a post, I can always manually add images later (and host them under `static/` like any normal Hugo site).
How to use it
1. Download your LinkedIn archive and extract it somewhere.
2. Copy the relevant files into this repo under `scripts/linkedin/`:
   - `Shares.csv`
   - `Articles/Articles/` (optional)
3. Run the importer:

   ```sh
   go test ./scripts/import_linkedin
   go run ./scripts/import_linkedin/import_linkedin_posts.go
   ```

4. Review what got generated under `content/blog/` (titles, slugs, formatting).
5. Run `hugo serve` and sanity-check a couple of pages.
It's intentionally boring: it creates posts, skips empty entries, and refuses to overwrite an existing file.
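If you build something similar, the no-overwrite behavior is the part worth stealing. A minimal sketch of it (my assumption of how to do it, not the script's exact code) uses `os.O_EXCL`, so a re-run can never clobber a post you've already edited:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// writePostOnce skips empty content and refuses to clobber an existing
// file, so re-running the importer is safe for already-edited posts.
func writePostOnce(path, content string) error {
	if strings.TrimSpace(content) == "" {
		return nil // skip empty entries entirely
	}
	// os.O_EXCL makes the create fail if the file already exists.
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
	if err != nil {
		if os.IsExist(err) {
			fmt.Printf("skipping %s: already exists\n", path)
			return nil
		}
		return err
	}
	defer f.Close()
	_, err = f.WriteString(content)
	return err
}

func main() {
	err := writePostOnce("example-post.md",
		"---\ntitle: Example\n---\n\nHello from the importer sketch.\n")
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

Deleting the generated file is then the explicit way to regenerate a post.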
How it was built (and the "breakthrough" moment)
I deliberately used this as an AI playground. These are the kinds of projects I like to experiment on, without the "urgency and importance" of work-related tasks (which usually can't stall for weeks or months).
During the process, I tried multiple AI agents, different workflows, and different models. And it was… humbling.
Copilot/Cursor with Sonnet 4.5 and ChatGPT 5.1 could get me started, but they consistently struggled once I threw the real export data at them. The LinkedIn CSV format is messy (quotes, embedded newlines, odd separators), and without a lot of hand-holding and detailed constraints, the output would look correct on a toy example and fall apart in practice.
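For context on that messiness: Go's `encoding/csv` handles newlines embedded in quoted fields natively, but it needs to be configured leniently to survive LinkedIn's quirks. Something along these lines - my sketch of the general approach, not code from the script:

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
)

func main() {
	f, err := os.Open("scripts/linkedin/Shares.csv")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer f.Close()

	r := csv.NewReader(f)
	r.LazyQuotes = true    // tolerate stray/unescaped quotes inside fields
	r.FieldsPerRecord = -1 // rows don't always have a uniform column count
	// Note: encoding/csv already handles newlines embedded in quoted
	// fields - the exact case that breaks naive line-by-line parsing.

	records, err := r.ReadAll()
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for i, rec := range records {
		if i == 0 {
			continue // skip the header row
		}
		fmt.Println("first column:", rec[0])
	}
}
```

Even with a lenient reader, most of the real work was in normalizing whatever came out of those fields.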
Opus 4.5 was the one that got me to the breakthrough: a cleaner parsing strategy, better normalization rules, and the idea of locking down the tricky bits with tests (see `scripts/import_linkedin/import_linkedin_posts_test.go`).
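In that spirit, a regression test for the nastiest case (embedded newlines inside a quoted field) might look like this. It's a hypothetical example in the style of that test file, not copied from it:

```go
package main

import (
	"encoding/csv"
	"strings"
	"testing"
)

// Hypothetical regression test: a ShareCommentary containing a newline
// inside a quoted CSV field must parse as one record, not two.
func TestQuotedNewlineStaysOneRecord(t *testing.T) {
	in := "Date,ShareCommentary\n2023-01-02,\"line one\nline two\"\n"
	r := csv.NewReader(strings.NewReader(in))
	r.LazyQuotes = true

	records, err := r.ReadAll()
	if err != nil {
		t.Fatal(err)
	}
	if len(records) != 2 { // header + exactly one data row
		t.Fatalf("expected 2 records, got %d", len(records))
	}
	if want := "line one\nline two"; records[1][1] != want {
		t.Fatalf("commentary mangled: %q", records[1][1])
	}
}
```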
The meta-lesson was the same as in many other areas: tools are only as good as the process around them. The "AI" part helped, but I still had to supervise, validate, and iterate.
Another thing where AI helped a lot was the boring-but-necessary work after the import: categorizing posts, suggesting tags/topics, and proposing a set of "review tasks" (what should be marked as draft, what should be kept, what should be reworded, etc.). I kept those suggestions as CSVs in the `analysis/` folder of the repo, so I can batch-review them over time: github.com/zetxek/adrianmoreno.info/tree/main/analysis.
What comes next
Maybe nothing.
This might be one of those projects that ends with this article, and I forget about updating the archive again. That's a perfectly valid ending: I got the experiment, I got the content out, and I got a blog post out of it.
But I can also see a few alternate timelines:
- Maybe someone finds it useful and I open source the script (that's literally how the Adritian theme started).
- Maybe I evolve it into a small utility you can point at your own LinkedIn export, so you can "download" your content into whatever format you want (Hugo, Markdown, etc.), without reinventing the wheel.
- Maybe I improve the weakest part: images. Not by scraping LinkedIn, but by taking whatever image references the export already contains (or that articles embed), downloading what's accessible, and rewriting posts to use locally hosted assets.
If you want to adapt it for your own site, the main thing you'll probably change is the frontmatter mapping (categories/taxonomies) and how aggressive you want to be in the HTML → Markdown conversion.
And if LinkedIn ever decides to make exporting worse (wouldn't surprise me), at least I've already extracted the important part: my writing ✍️.
The Markdown files also capture my style and tone of voice, which I can use for other experiments, such as draft-creator AI agents that give me drafts based on topics I find relevant to elaborate on.
(PS: this is how this article was "built": AI-drafted, human-edited.)