In looking around the web, I don't see much on how to get SonarQube working in AWS. This blog will cover how to set up SonarQube in AWS for simple code analytics.
Background
SonarQube is an excellent open source code analysis server that you host yourself. It provides code analysis and central reporting: complexity, dependencies, coverage, and so on. If you are on a Microsoft stack you can use TFS to get these numbers, but for open source work across multiple languages the tooling is more limited, and SonarQube is still one of the better options.
You can host SonarQube on your own machines and show the reports from there, but if you are doing everything in the cloud or have a distributed team, it’s a little trickier. This blog will cover that case.
Goals and requirements
- I need code analysis that shows code complexity, common bugs, and supports multiple languages like Java and NodeJS.
- I need it to work with many repos and users. Some tools charge per repo or per user, and those costs escalate quickly.
- I want to use free services as much as possible. My code is not open source, so I can’t take advantage of a lot of the services that are free to open source.
- I’m using GitHub private repositories and CircleCI for continuous build and integration.
These are my goals. You may not need all of them, but if you want to make code quality part of your build and deployment process, SonarQube in AWS is a reasonable way to go.
Procedure
Get SonarQube running with its built-in database
Create your AWS instance
I went with a single Amazon EC2 medium instance running 64-bit Linux. The docs say you might need up to a large instance, depending on how many users you have; 2 GB of RAM is the minimum.
Linux plus SonarQube doesn't have a large disk footprint, a little over a gigabyte, so the volume doesn't need to be big. Include room for a copy of your code. I started with an 8 GB volume, which was overkill.
Open up a custom port TCP:9000 for access to the SonarQube site. That and your SSH port (22) are all you'll need.
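If you prefer to script the security group instead of clicking through the console, a rough sketch with the AWS CLI looks like this; the group name is a placeholder for whatever security group your instance uses:

aws ec2 authorize-security-group-ingress --group-name my-sonar-sg --protocol tcp --port 9000 --cidr 0.0.0.0/0
# ideally restrict the SSH rule to your own IP range rather than 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-name my-sonar-sg --protocol tcp --port 22 --cidr 0.0.0.0/0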
- Download and unzip SonarQube and the SonarQube Scanner (see the SonarQube and Scanner installation docs for the downloads).
In my case, I just downloaded and unzipped the files on my Windows desktop, then copied them to the AWS machine using WinSCP. SonarQube suggests putting the server under /etc, which may require an extra step.
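If you'd rather do everything on the server, something like this works; the URLs are placeholders, so grab the real ones from the download pages:

sudo yum install unzip   # if it isn't already installed
cd /etc
sudo wget <url-of-sonarqube-5.6.3-zip>
sudo wget <url-of-sonar-scanner-2.8-zip>
sudo unzip sonarqube-5.6.3.zip
sudo unzip sonar-scanner-2.8.zip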
- Upgrade Java on the AWS instance to a version that works with SonarQube
Install Java and change the Java version
sudo yum install java-1.8.0
sudo alternatives --config java
- Start the service and test it
cd /etc/sonarqube-5.6.3/bin/linux-x86-64
sudo ./sonar.sh start
With a browser, connect to http://<my_aws_ip_or_dns_name>:9000 to verify that SonarQube is running.
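You can also sanity-check it from the server itself. Assuming your SonarQube version exposes the system status web service, this should report a status of UP:

curl http://localhost:9000/api/system/status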
Switch SonarQube to a Postgres backend
SonarQube shouldn't be run in production with the embedded database it ships with, so let's move it to a Postgres DB. You could use any supported database, but a local Postgres instance is free.
Ideally, you'd put your Postgres data on a separate volume for easy management, but because I don't care about the historical data and will just recreate the DB if I need to restore it, I left the DB on the primary volume.
To install Postgres in AWS, you could follow this guide. It recommends that you create the extra volume for the Postgres data if you want. That’s a good practice, in general, but since this data can be rebuilt with a scan, I didn’t bother and just did a quick and dirty local installation.
For a quick Postgres installation, just do this:
sudo yum install postgresql9-server
Set the Postgres authentication and user/password
sudo -u postgres psql postgres
# at the psql prompt, set the default password for the postgres account:
\password postgres
Now, still as the postgres account (run sudo su - postgres if you're not; exit when you're done):
cd /var/lib/pgsql9/data
Edit pg_hba.conf to change peer to md5, as per these instructions.
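For reference, the relevant pg_hba.conf lines end up looking something like this, with md5 in the METHOD column:

local   all   all                 md5
host    all   all   127.0.0.1/32  md5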
Create the Empty DB and User
From the docs: Create an empty schema and a SonarQube user. Grant this SonarQube user permission to create, update, and delete objects for this schema. The charset of the database has to be set to “UTF-8” and the language (database and user) to “English.”
su - postgres
psql
CREATE USER myuser WITH PASSWORD 'myPassword';
CREATE DATABASE sonarqube;
GRANT ALL PRIVILEGES ON DATABASE sonarqube TO myuser;
\q
Start up the Postgres service
Note: Postgres has to be initialized and running before the psql steps above will connect; if those failed, run these two commands first.
sudo service postgresql initdb #this is a one-time operation
sudo service postgresql start
Configure SonarQube to use Postgres
Edit the conf/sonar.properties file under your SonarQube install directory (for example, /etc/sonarqube-5.6.3/conf/sonar.properties) as described in the SonarQube installation docs.
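For the local Postgres setup above, the relevant lines look roughly like this, using the user and database created earlier:

sonar.jdbc.username=myuser
sonar.jdbc.password=myPassword
sonar.jdbc.url=jdbc:postgresql://localhost/sonarqube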
Now restart SonarQube
cd /etc/sonarqube-5.6.3/bin/linux-x86-64
sudo ./sonar.sh stop
sudo ./sonar.sh start
and test again in the browser
Add some minimal security
Hey, this is your source code. You probably don't want to expose it on an open HTTP connection unless your code is open source.
- Log in as the SonarQube admin and change the admin password
- Create a user account to scan code (you'll use it below; a scripted sketch follows this list)
- Create other user accounts for your users
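If you'd rather script the scan account instead of clicking through the UI, SonarQube has a web API for it. A sketch, assuming the api/users/create endpoint in your version and the new admin password you just set:

curl -u admin:<new_admin_password> -X POST "http://localhost:9000/api/users/create?login=theScanAccount&name=Scan%20Account&password=thePassword"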
Add your code
I found the easiest way to set up GitHub was just to copy my local GitHub repo over to the AWS server using WinSCP.
Try a git pull on the AWS server to see if you can sync the code.
Set up the sonar-project.properties files to scan the code, following the guidance here. You could have one sonar-project.properties file for all the repos or a separate file for each. I went with a separate file for each because that way I can quickly run a scan on just one repo, and later I can create the on-demand service for a single repo.
In the sonar-project.properties file for each repo add:
sonar.login=theScanAccount
sonar.password=thePassword
so the scan process can connect to SonarQube.
You’ll also have to edit the standard metadata. Example:
# Required metadata
sonar.projectKey=myproject
sonar.projectName=myproject
sonar.projectVersion=1.0

# Comma-separated paths to directories with sources (required)
sonar.sources=src

# Language
sonar.language=js
Now try a scan using
/etc/sonar-scanner-2.8/bin/sonar-scanner
I put the scanner in /etc. You might have put it in /usr or /home; adjust the path accordingly.
Now check out the SonarQube web UI to see your data. This is probably a good time to configure your rules. The code quality rules that SonarQube starts with may not match your code style guidelines, so it’s good to get those in sync before you show this to your team.
Adding code coverage to SonarQube
If you have SonarQube in AWS, you probably run your tests there or in some cloud testing tool. I’m using CircleCI for the build, so the test and coverage files are on their servers.
Run aws configure to add the access key, secret key, and region to your local AWS config files. The region should be the full name, such as "us-west-1".
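The prompts look like this; the values shown are placeholders:

aws configure
# AWS Access Key ID [None]: <your-access-key-id>
# AWS Secret Access Key [None]: <your-secret-access-key>
# Default region name [None]: us-west-1
# Default output format [None]: json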
In my case, I'm using CircleCI, so at the end of my build I have a code coverage file, lcov.info, on the CircleCI server, which I need to get to my SonarQube server in AWS. To do this, I copy the coverage file to an AWS bucket and use a script to copy it from the bucket when I run the SonarQube scan.
In the CircleCI circle.yml:

test:
  override:
    # run the tests and generate the lcov.info file, then:
    - aws s3 cp ./coverage/lcov.info s3://your-aws-bucket/sonar/your-repo/
Automating
I'm going to tell you how to set this up for a scheduled build, rather than an on-demand continuous build. You can do an on-demand build, but it requires a minimal REST service on the server (and another open port). I'll talk about that later.
Create an automation script
Create a script to automate the scans. I called mine pullAll.sh. Here's some code to do a git pull in each repo, copy the coverage file from an Amazon bucket, and finally run the SonarQube scan.
function pullRepo {
  echo " "
  echo "entering repo "$1
  cd ../$1
  pwd
  git checkout master
  git pull
  aws s3 sync s3://your-aws-bucket-here/sonar/$1/ .
  if [ -f "lcov.info" ]; then
    # copy over the coverage data
    sed 's/home\/ubuntu/data/g' lcov.info > lcov.info.txt
    mv -f lcov.info.txt lcov.info
  fi
  if [ -f "sonar-project.properties" ]; then
    # scan
    /etc/sonar-scanner-2.8/bin/sonar-scanner
  else
    echo "skipping scan for " $1
  fi
}

cd /data/your-repo
pullRepo your-repo
pullRepo next-next-repo
# etc.
Sure, I could’ve written some fancy script that went into every directory, but this is simple to maintain and easy to control if you want to just scan some repos.
Schedule the scans
To update the scan on a regular basis, create a simple sh script in /etc/cron.daily (or whatever frequency you want). It should contain something like this:
/usr/bin/pullAll.sh > /home/ec2-user/pullAll.log
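A minimal version of that wrapper, assuming pullAll.sh lives in /usr/bin, might be:

#!/bin/sh
# /etc/cron.daily/sonarscan -- remember to chmod +x this file
/usr/bin/pullAll.sh > /home/ec2-user/pullAll.log 2>&1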
Implementing on-demand build
I didn’t do this part, so I’ve left it as an exercise to the reader. If you do it, send me the code!
All you need is a small RESTful web service that runs on the SonarQube machine. Call a single API, "scanRepo", with the repo name as a parameter, and it runs the script to pull and scan that repo. It's probably about 10 lines in NodeJS.
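Here's a rough, untested sketch of what that service could look like in NodeJS. Everything specific in it is an assumption: the port, and a hypothetical /usr/bin/pullRepo.sh that pulls and scans a single repo (you could carve one out of pullAll.sh above). You'd also want to lock the port down or add some auth before exposing it.

// minimal on-demand scan service (sketch)
const http = require('http');
const url = require('url');
const { execFile } = require('child_process');

http.createServer(function (req, res) {
  const parsed = url.parse(req.url, true);
  if (parsed.pathname === '/scanRepo' && parsed.query.repo) {
    // hypothetical helper script: pulls and scans one repo,
    // like the pullRepo function in pullAll.sh above
    execFile('/usr/bin/pullRepo.sh', [parsed.query.repo], function (err) {
      res.end(err ? 'scan failed\n' : 'scan complete\n');
    });
  } else {
    res.statusCode = 404;
    res.end('unknown API\n');
  }
}).listen(3000); // another port to open in the security group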
Copyright © 2019, All Rights Reserved by Bill Hodghead, shared under creative commons license 4.0