Going serverless

Vercel + Next.js: things I learned along the way

If you’re hoping for a Vercel/Next.js bash, this is the wrong post. I’m pretty much in ❤ with it. Coming from a more monolithic architecture, though, I faced a few problems, some of which I didn’t expect. In this post I’ve tried to collect all the issues I faced over the last few months and the solutions I used to work around them.

Database

In a monolithic app, db connection time is negligible; you’ll never notice it. It happens once at startup and then you just reuse the pool. On serverless, connecting to MongoDB was relatively slow for me (>1s), and a reconnection might happen on every call. Locally or in a monolith, creating the connection is a lot faster for me, which might be due to better colocation of database and app server. You have little control over that on Vercel, I guess.

While I didn’t find a 100% solution for this issue, Atlas provides some docs about increasing the chance of connection reuse on lambdas, which makes frequently used API endpoints faster. This only helps endpoints that are called often, though: for ones that are only called, say, every 2 minutes, there will never be a cache hit.
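A minimal sketch of how such connection reuse could look: the connection promise is kept in module scope, so a lambda instance that is still warm picks it up again instead of reconnecting. `DbClient`, `ConnectFn` and `getClient` are names made up for this illustration, and `connect` is assumed to wrap whatever connect call the driver offers.

```typescript
// Hypothetical stand-in for the driver's connected client object.
interface DbClient {
  close(): Promise<void>;
}

// Assumption: wraps the driver's real connect call.
type ConnectFn = () => Promise<DbClient>;

// Module scope outlives a single invocation for as long as the lambda stays warm.
let cachedClient: Promise<DbClient> | null = null;

function getClient(connect: ConnectFn): Promise<DbClient> {
  if (cachedClient === null) {
    // Cache the promise rather than the resolved client, so that
    // concurrent callers share a single connection attempt.
    cachedClient = connect().catch((err) => {
      cachedClient = null; // do not keep a failed attempt around
      throw err;
    });
  }
  return cachedClient;
}
```

A cold start still pays the full connection time; only later calls on the same warm instance get the cached promise.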

Another problem, probably more specific to MongoDB, is connection pooling. In MongoDB, a client instance opens a connection pool holding a certain number of connections, which are then shared by everything that uses that client. On serverless, though, sharing connections between lambdas is not possible, as each lambda runs in its own isolated environment. Each MongoDB server has a connection maximum, which for MongoDB Atlas is 500 on the shared servers.
At a certain point I hit that limit: to work around the connection delay issue I didn’t close connections automatically, so stale connections plus new connections were just too much for my db instance.

I worked around that issue in two ways:
1. I realized that some of the very minimal API endpoints I had created could be combined, as I always needed their data together anyway.
2. The default connection pool size is 5, but for most APIs I didn’t run more than one or two queries/mutations, and they usually run sequentially, so there was no point in having a lot of connections. I could decrease the poolSize without any noticeable drawback.
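To make the pool-size trade-off concrete, here is a small worked calculation. It assumes the worst case in which every warm lambda instance holds a completely filled pool; `maxWarmInstances` is a made-up helper name, and the 500 is the Atlas limit quoted above. As far as I know, newer versions of the Node.js driver call the option `maxPoolSize` instead of `poolSize`.

```typescript
// Upper bound on warm lambda instances before the server's connection
// limit is exhausted, assuming each instance holds a full pool (worst case).
function maxWarmInstances(connectionLimit: number, poolSize: number): number {
  if (!Number.isInteger(poolSize) || poolSize < 1) {
    throw new RangeError("poolSize must be a positive integer");
  }
  return Math.floor(connectionLimit / poolSize);
}

const ATLAS_SHARED_LIMIT = 500; // figure quoted in the post

console.log(maxWarmInstances(ATLAS_SHARED_LIMIT, 5)); // 100
console.log(maxWarmInstances(ATLAS_SHARED_LIMIT, 1)); // 500
```

Stale connections that were never closed count against the same limit, so the real headroom is smaller than these numbers suggest.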

Data fetching

Disclaimer: this might be a not-so-common use case.
I built a sort of website generator that gets injected with a remote config at build time. I therefore wanted to load the config in getStaticProps of _app and inject it into all pages. Sadly, that feature does not exist yet.

I worked around this issue by 1) fetching the config as a build step, 2) serializing it as JSON and caching it in the file system, and 3) importing it where needed.
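The three steps above could be sketched roughly like this. The `SiteConfig` shape, the function names and the injected `fetchConfig` are assumptions made for illustration; the real config and the way it is fetched depend on the generator.

```typescript
import { readFileSync, writeFileSync } from "node:fs";

// Hypothetical config shape; the real one depends on the generator.
interface SiteConfig {
  title: string;
  theme: string;
}

// Steps 1 and 2: run before the actual build, fetch the remote config
// and cache it as JSON on the file system.
async function cacheRemoteConfig(
  fetchConfig: () => Promise<SiteConfig>,
  targetPath: string,
): Promise<void> {
  const config = await fetchConfig();
  writeFileSync(targetPath, JSON.stringify(config, null, 2), "utf8");
}

// Step 3: pages read the cached file instead of calling the remote service again.
function readCachedConfig(sourcePath: string): SiteConfig {
  return JSON.parse(readFileSync(sourcePath, "utf8")) as SiteConfig;
}
```

Since the file only exists after the build step has run, the step has to come first in the build command.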

getStaticPaths and getStaticProps are incredibly handy utilities for generating masses of static pages. Sadly, there’s no built-in way to share non-route-related data between the two. With a growing amount of data, this resulted in a quite unnecessary increase in build time, as the usual workflow is something like:
- getStaticPaths fetches a list of x
- getStaticProps fetches a single x, once for every path

This seems to be a relatively common problem, and I worked around it by writing the fetched data to a JSON file on the file system and using that as a cache.
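A minimal sketch of such a file-based cache, assuming the full list contains everything a single page needs. `Item`, `loadItems`, `loadItem` and the injected `fetchAll` are hypothetical names; the idea is that getStaticPaths would call `loadItems` and getStaticProps would call `loadItem`.

```typescript
import { existsSync, readFileSync, writeFileSync } from "node:fs";

// Hypothetical item shape.
interface Item {
  id: string;
  title: string;
}

// Fetches the list once and writes it to disk; later calls read the file.
async function loadItems(
  cacheFile: string,
  fetchAll: () => Promise<Item[]>,
): Promise<Item[]> {
  if (existsSync(cacheFile)) {
    return JSON.parse(readFileSync(cacheFile, "utf8")) as Item[];
  }
  const items = await fetchAll();
  writeFileSync(cacheFile, JSON.stringify(items), "utf8");
  return items;
}

// Looks up a single item in the cached list instead of fetching it remotely.
async function loadItem(
  cacheFile: string,
  fetchAll: () => Promise<Item[]>,
  id: string,
): Promise<Item | undefined> {
  const items = await loadItems(cacheFile, fetchAll);
  return items.find((item) => item.id === id);
}
```

The cache file has to be deleted or overwritten at the start of each build, otherwise the pages are generated from stale data.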

Build

Coming from non-Next.js applications, you might be familiar with bundlesize, size-limit et al. for monitoring your frontend bundle/page size budgets. Next.js does some clever things for bundle chunking, which makes it hard to reason about the bundle size of a particular page.

As this wasn’t the biggest priority for now, I went the easy route and just set a budget for a selection of chunks. As Vercel/Next.js are incorporating a lot of dev/ops things right into the system, I could imagine that at some point this becomes a page-level config feature you can opt in to.
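A per-chunk budget check boils down to comparing measured sizes against limits. The sketch below is not the API of bundlesize or size-limit; `checkBudgets` and the chunk names in the usage are made up, and the sizes are assumed to be collected from the build output by some other step.

```typescript
interface BudgetViolation {
  chunk: string;
  size: number; // measured size in bytes
  limit: number; // budget in bytes
}

// Returns every budgeted chunk whose measured size exceeds its limit.
function checkBudgets(
  sizes: Record<string, number>,
  budgets: Record<string, number>,
): BudgetViolation[] {
  const violations: BudgetViolation[] = [];
  for (const [chunk, limit] of Object.entries(budgets)) {
    const size = sizes[chunk];
    // A chunk missing from the build output is skipped, not reported.
    if (size !== undefined && size > limit) {
      violations.push({ chunk, size, limit });
    }
  }
  return violations;
}

// Usage with hypothetical chunk names and sizes.
const violations = checkBudgets(
  { framework: 42_000, main: 130_000 },
  { framework: 45_000, main: 100_000 },
);
console.log(violations.map((v) => v.chunk)); // [ 'main' ]
```

In CI, a non-empty result would fail the build.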

Different behavior on dev vs prod

This is by far the most frustrating issue, and I ran into it multiple times in various situations. It comes down to Next.js trying to hide the complexity of serverless; in the end you might face parts of it anyway.
When running next dev you can essentially do what you want, where you want, because after all it’s a Node app running on your system.
When running things on prod, though, your pages run in their own lambdas, which come with all the upsides and downsides of these environments.

While chromium works during the build, it will simply not work inside the deployed functions. That’s true not only for API routes but also for getStaticProps with revalidate: a build that uses chromium inside getStaticProps might work, but things might then fail during revalidation.

I created workarounds to avoid chromium in the places where I used it, but later found this issue, which might solve things as well.

Coming from VM-like, stateful environments, you might be used to caching things in memory or on the file system to speed up API responses. While in-memory caching might work to some degree (like with the mongo connection trick), file system caches and similar most definitely won’t.

I moved my cache to the database instead. A more clever way would probably have been Redis or similar, but I wanted to keep the tech stack small, so it seemed more reasonable to just reuse the db I already had.
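A database-backed cache can be sketched as a key, a serialized value and an expiry timestamp. `CacheStore` below is a hypothetical interface that would map onto a collection or table in the existing db; the in-memory implementation only exists so that the sketch runs on its own.

```typescript
interface CacheEntry {
  key: string;
  value: string; // JSON-serialized payload
  expiresAt: number; // epoch milliseconds
}

// Hypothetical storage interface; in practice backed by the existing db.
interface CacheStore {
  find(key: string): Promise<CacheEntry | undefined>;
  upsert(entry: CacheEntry): Promise<void>;
}

// Returns the cached value while it is fresh, otherwise recomputes and stores it.
async function cachedValue<T>(
  store: CacheStore,
  key: string,
  ttlMs: number,
  compute: () => Promise<T>,
  now: () => number = Date.now,
): Promise<T> {
  const hit = await store.find(key);
  if (hit !== undefined && hit.expiresAt > now()) {
    return JSON.parse(hit.value) as T;
  }
  const value = await compute();
  await store.upsert({ key, value: JSON.stringify(value), expiresAt: now() + ttlMs });
  return value;
}

// In-memory stand-in for the database, for demonstration only.
function createMemoryStore(): CacheStore {
  const entries = new Map<string, CacheEntry>();
  return {
    find: async (key) => entries.get(key),
    upsert: async (entry) => {
      entries.set(entry.key, entry);
    },
  };
}
```

Unlike an in-memory or file system cache, the entries survive cold starts and are shared between all lambdas, at the cost of one extra db round trip per lookup.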

Inherent complexity from serverless

I’m not sure whether Vercel/Next.js could do anything about these issues, as they are more general problems you might face with serverless.

The whole serverless structure is event-based at its core. There’s nothing like long-running or recurring jobs built in. With this in mind, I had to learn that most things I used to do with long-running jobs or cron jobs can be done in an event-based fashion. Still, I always ended up needing cron jobs for some things. This is the case whenever I want to trigger some event not immediately in response to another event but with a delay of, say, 15 minutes. The same issue arises when you want to do something that is just “not event-based”, like sending a notification email once a week.

Vercel recently added some docs for this, and there is a community project targeting this in a more DX-friendly way.
I started off using Google Cloud Scheduler, but found it too expensive (10ct per job/month) for what is essentially a ping service. It was also unnecessarily hard to integrate into my dev/code setup. Eventually I ended up spinning up a long-running App Engine app running agenda, which I deployed alongside my Next.js app.
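The “delayed event” case can be sketched independently of any scheduler: a job is stored with the time at which it should run, and whatever wakes up periodically (a ping, a cron job or a long-running process) picks up the jobs that are due. This is not agenda’s API; `DelayedJob`, `scheduleIn` and `takeDueJobs` are made-up names, and a real setup would keep the jobs in the db rather than in an array.

```typescript
interface DelayedJob {
  id: string;
  runAt: number; // epoch milliseconds
  done: boolean;
}

// Creates a job that becomes due `delayMs` after `now`.
function scheduleIn(id: string, delayMs: number, now: number): DelayedJob {
  return { id, runAt: now + delayMs, done: false };
}

// Returns all jobs that are due at `now` and marks them as done,
// so that the next wake-up does not run them again.
function takeDueJobs(jobs: DelayedJob[], now: number): DelayedJob[] {
  const due = jobs.filter((job) => !job.done && job.runAt <= now);
  for (const job of due) {
    job.done = true;
  }
  return due;
}
```

The delay is only as precise as the wake-up interval: with a ping every minute, a 15 minute delay can end up being almost 16.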

This goes in essentially the same direction as the previous point. Lambdas are nice for simple function execution, but as they are short-running and stateless they are not suited for things like keeping websockets open for features that require real-time updates (e.g. notifications or chat).

I started solving this by experimenting with MongoDB Realm and Firebase Realtime Database, with the goal of moving the real-time parts away from the Next.js API to the client. In the end I just used polling, quite ironically, which simplified my client code quite a bit. I realized that in my case no one cared about real-time, and a polling delay of a few seconds just makes that requirement go away.
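A minimal sketch of such client-side polling, assuming a hypothetical `fetchLatest` function that calls the API endpoint. `startPolling` is a made-up helper, not part of any library; it only reports a value when it differs from the previous one.

```typescript
interface Poller {
  stop(): void;
}

// Calls `fetchLatest` repeatedly and invokes `onChange` whenever the result changes.
function startPolling<T>(
  fetchLatest: () => Promise<T>,
  onChange: (value: T) => void,
  intervalMs: number,
): Poller {
  let stopped = false;
  let lastSerialized: string | undefined;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const tick = async (): Promise<void> => {
    try {
      const value = await fetchLatest();
      const serialized = JSON.stringify(value);
      if (!stopped && serialized !== lastSerialized) {
        lastSerialized = serialized;
        onChange(value);
      }
    } catch {
      // Errors are swallowed here; the next tick simply tries again.
    } finally {
      // Schedule the next tick only after the current one has finished,
      // so slow responses never overlap.
      if (!stopped) {
        timer = setTimeout(tick, intervalMs);
      }
    }
  };

  void tick();

  return {
    stop(): void {
      stopped = true;
      if (timer !== undefined) {
        clearTimeout(timer);
      }
    },
  };
}
```

Using a chained setTimeout rather than setInterval is a design choice: the delay is counted from the end of one request to the start of the next.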

¹This seems to generally be a bigger problem than I expected, as a lot of these “just deploy your app” kind of cloud service providers don’t support websockets (at least default App Engine and Cloud Run don’t).

Part of my data doesn’t come from my own services but is provided by a 3rd party, which doesn’t allow setting up a callback hook and only offers the data over websocket or polling. So I was facing the same issue as with cron jobs, and this time I couldn’t offload the effort to the client via polling, as I first needed to process that data in the database.

While there are SaaS offerings which essentially allow you to transform websocket events into API callbacks, I didn’t want to use yet another 3rd-party service, so I just added a service on my App Engine server which polls for events and writes them to the db.
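One way to sketch such a server-side poller is a single ingestion step that remembers how far it got. `ExternalEvent`, `ingestOnce`, `fetchSince` and `save` are hypothetical names; the 3rd-party API is assumed to return only events newer than a given timestamp, which the real provider may or may not support.

```typescript
// Hypothetical shape of an event from the 3rd party.
interface ExternalEvent {
  id: string;
  timestamp: number; // epoch milliseconds
  payload: string;
}

// Fetches events newer than `cursor`, persists them and returns the new cursor.
async function ingestOnce(
  fetchSince: (cursor: number) => Promise<ExternalEvent[]>,
  save: (events: ExternalEvent[]) => Promise<void>,
  cursor: number,
): Promise<number> {
  const events = await fetchSince(cursor);
  if (events.length === 0) {
    return cursor; // nothing new, keep the old cursor
  }
  await save(events);
  // Advance the cursor only after the events were saved successfully.
  return events.reduce((max, event) => Math.max(max, event.timestamp), cursor);
}
```

The long-running service would call `ingestOnce` in a loop and store the returned cursor, so that a restart continues where it left off.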

I hope this article provided you with some workarounds for issues you might face. My overall takeaway from these serverless experiments is that “going full serverless” might be an unrealistic or undesirable goal. While it’s certainly possible to achieve, it comes at the price of a Frankenstein architecture that stitches x services together just to solve problems that wouldn’t be problems otherwise.

I’m overall happy with the setup I ended up with (although I’m not sure I’d pick MongoDB as a db again):
- a serverless Next.js core serving the frontend & isolated API endpoints
- a server application doing the more complex/stateful tasks

