Secured Jupyter Notebook on Amazon Linux

My last post was about running Jupyter remotely and connecting to it through an SSH tunnel. That turned out to be inconvenient: too many steps to launch Jupyter and then connect to it. I read through the documentation on the Notebook website, and it has pretty detailed instructions on how to run a public server.

I’m using an AWS EC2 c4 instance with Amazon Linux. In general there are two steps: making the server public and then securing it.

First you have to generate a configuration file:

jupyter notebook --generate-config

Then generate a SHA1 password hash for your login from the Python interactive prompt:

python
>>> from notebook.auth import passwd
>>> passwd()
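
To make the string that passwd() prints less mysterious, here is a rough standard-library sketch of how I understand it to be built: the algorithm name, a random salt and the digest of password plus salt, joined with colons. The function names sketch_passwd and sketch_check are made up for this illustration, and the exact salt length is an assumption; for the real thing keep using notebook.auth.passwd.

```python
import hashlib
import hmac
import secrets


def sketch_passwd(passphrase, algorithm="sha1", salt_len=12):
    """Build an 'algorithm:salt:digest' string (illustrative, not the library code)."""
    # Assumption: the salt is a short random hex string (12 hex characters here).
    salt = secrets.token_hex(salt_len // 2)
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, h.hexdigest()))


def sketch_check(hashed, passphrase):
    """Recompute the digest with the stored salt and compare in constant time."""
    algorithm, salt, digest = hashed.split(":", 2)
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return hmac.compare_digest(h.hexdigest(), digest)
```

The point is that the config file only ever stores the salted digest, never the password itself, so the whole string (including the sha1: prefix) is what goes into the configuration.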

Then update your configuration file /home/user/.jupyter/jupyter_notebook_config.py by adding these settings to the end:

c.NotebookApp.ip='*'
c.NotebookApp.password=u'sha1:<your hashed password here>'
c.NotebookApp.open_browser=False
c.NotebookApp.port=9999

Now you can run jupyter notebook and access your server using its public IP or DNS name. But it’s better to secure the connection with SSL/TLS, and for that you need an SSL certificate and key. I will describe my case, where I registered a DNS A record for my hostname and then used Let’s Encrypt to generate a valid HTTPS certificate.

The first step is obviously registering your DNS hostname, which is out of scope here. (With AWS Route 53 it is super easy, though.)

Then you have to configure your firewall to accept connections on port 443 (you can remove that rule later). In AWS you need to update the security group for your instance and create a rule for HTTPS (port 443).

The next step is downloading a tool from Let’s Encrypt:

wget https://dl.eff.org/certbot-auto
chmod a+x certbot-auto

That tool does the whole job: it creates the keys and the certificate and then gets the certificate signed. That’s why it requires port 443 to be open: it checks that you actually own the domain by connecting to it from an outside server. Don’t be scared by the number of packages it installs during the first run.
So the command is:

sudo ./certbot-auto certonly --standalone --debug -d <your domain>

When it finishes you will get a bunch of files in the /etc/letsencrypt directory. The ones you need are in the /etc/letsencrypt/live/<your domain> folder. My problem was that these files are symlinks to ../archive, and ec2-user can’t read them. So I had to change permissions:

sudo chmod +x /etc/letsencrypt/archive/
sudo chmod +r /etc/letsencrypt/archive/*

After that we can specify our key and certificate in the Notebook config file:

c.NotebookApp.certfile=u'/etc/letsencrypt/live/<domain>/fullchain.pem'
c.NotebookApp.keyfile=u'/etc/letsencrypt/live/<domain>/privkey.pem'
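
Since the permissions on those symlinked files were what tripped me up, a quick check run as the user that starts the Notebook can save a confusing restart. This is only a small sketch: the helper name unreadable is made up here, and the paths in the comment are the same placeholders as in the config above.

```python
import os


def unreadable(paths):
    """Return the paths the current user cannot open for reading.

    os.access follows symlinks, so a link in live/ that points at a
    file the user cannot read in archive/ is reported as well.
    """
    return [p for p in paths if not os.access(p, os.R_OK)]


# Example call (replace <domain> with your own hostname first):
# unreadable(["/etc/letsencrypt/live/<domain>/fullchain.pem",
#             "/etc/letsencrypt/live/<domain>/privkey.pem"])
```

An empty list means both files can be read; anything else is a path whose permissions still need fixing.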

Now you can restart your Notebook, and you must use HTTPS for the connection: https://<domain>:9999/
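
If you want to confirm from another machine that the server really presents a certificate your system trusts, something like the sketch below should do. It uses only the Python standard library; the function names are made up for this example, port 9999 is just the value from my config, and the expiry helper is there because Let’s Encrypt certificates are short-lived and need renewing.

```python
import socket
import ssl


def fetch_not_after(host, port=9999, timeout=5.0):
    """Open a verified TLS connection and return the certificate's notAfter string.

    The handshake raises ssl.SSLError if the certificate is not trusted
    or does not match the hostname.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now):
    """Days between a Unix timestamp 'now' and the certificate's notAfter date."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


# Example call (hypothetical hostname):
# import time
# print(days_until_expiry(fetch_not_after("<domain>"), time.time()))
```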

My Jupyter Notebook also starts during the boot sequence. On Amazon Linux you can use the /etc/rc.d/rc.local file for that by adding this command to it:

jupyter notebook --config path_to_your_config > /var/log/jupyter-notebook.log 2>&1 &