networkandcode | blog > practice > share | [ network coding cloud… ] This site holds notes/posts related to technical topics such as cloud, containers, networking, automation, and so on. Tech keywords to search for on the site include Kubernetes, OpenStack, Ansible, Python, etc.

Subscribe to AWS IoT on InfluxDB

<p>This post first appeared on <a href="https://dev.to/aws-builders/harperdb-on-eks-1bcb">dev.to</a></p> <p>Hey :wave:, here is my first post after the community builder renewal :satisfied:. In this post, we would send some MQTT messages in line protocol format to AWS IoT, which would then get ingested into InfluxDB via a native MQTT subscription.</p> <p>InfluxDB usually relies on systems such as Telegraf or client programs to send metrics to it; however, with the native subscription here, we don’t need the intermediate system.</p> <p>As a prerequisite, follow this <a href="https://dev.to/aws-builders/aws-iot-pubsub-over-mqtt-1oig">post</a> to understand how to publish messages from a Python based MQTT client to the MQTT endpoint in AWS.</p> <p>We would be making changes to the publisher code, as InfluxDB looks for messages in a specific format called the line protocol.</p> <p>Refer to this <a href="https://docs.influxdata.com/influxdb/cloud/reference/syntax/line-protocol/#:~:text=InfluxDB%20uses%20line%20protocol%20to,timestamp%20of%20a%20data%20point.&amp;text=Lines%20separated%20by%20the%20newline,a%20single%20point%20in%20InfluxDB.">link</a> for more information on the line protocol.</p> <p>A sample message in our case would look like <em>temp_sensor,id=client-d36a601f-0b0e-4ba2-9d54-45feec236deb,room=lobby temp=30 1668930716251915510</em></p> <p>Here, temp_sensor is the measurement name, the tags are id and room, and the field is temp. The timestamp above is a Unix timestamp in nanoseconds. Note that the tags &amp; timestamp are optional in line protocol, and you can have more than one field; the timestamp defaults to the current time if not provided.</p>
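<p>To make the format concrete, here is a small helper that builds such a line and escapes the characters that are special in line protocol (commas, spaces and equals signs in tags). This is my own sketch, not part of any official client, so treat the names as illustrative.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import time

def escape(value):
    # commas, spaces and equals signs must be escaped in tag keys/values
    return value.replace(',', '\\,').replace(' ', '\\ ').replace('=', '\\=')

def to_line_protocol(measurement, tags, fields, ts=None):
    # shape: measurement,tag1=v1,tag2=v2 field1=v1 timestamp
    tag_str = ','.join(f'{escape(k)}={escape(v)}' for k, v in tags.items())
    field_str = ','.join(f'{k}={v}' for k, v in fields.items())
    ts = ts or time.time_ns()  # defaults to now, in nanoseconds
    return f'{measurement},{tag_str} {field_str} {ts}'

print(to_line_protocol('temp_sensor', {'id': 'client-1', 'room': 'lobby'}, {'temp': 30}))
</code></pre></div></div> <p>Running it prints a line in exactly the shape shown above, which is what our publisher will emit.</p>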
<p>Our publisher code would look like below.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>nc:~/environment/iot $ cat publish-line-protocol.py
# import vars from connect.py
from connect import args, client_id, mqtt_connection
from awscrt import mqtt
import random, time

while True:
    rooms = [ 'bed-room', 'hall', 'kitchen', 'living-room', 'lobby' ]
    room = random.choice(rooms)

    # set a random temperature between 24 and 31 Celsius (randrange excludes the upper bound)
    temp = random.randrange(24, 32)

    # set timestamp in nanoseconds
    now = time.time_ns()

    # form the message
    message = f'temp_sensor,id={client_id},room={room} temp={temp} {now}'

    # publish the message
    mqtt_connection.publish(
        topic=args.topic,
        payload=message,
        qos=mqtt.QoS.AT_LEAST_ONCE
    )
    print(f'Message published: {message}')
    time.sleep(1)
</code></pre></div></div> <p>Let’s run the code.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>nc:~/environment/iot $ python publish-line-protocol.py --ep $IOT_DEV_EP --pubcert pub-cert.pem --pvtkey pvt-key.pem --cacert ca-cert.pem --topic temperature
client-7219bca1-8c06-41c8-b69c-783c3ee85a41 is connected!
Message published: temp_sensor,id=client-7219bca1-8c06-41c8-b69c-783c3ee85a41,room=hall temp=29 1668932805475181112
Message published: temp_sensor,id=client-7219bca1-8c06-41c8-b69c-783c3ee85a41,room=hall temp=24 1668932806476308869
Message published: temp_sensor,id=client-7219bca1-8c06-41c8-b69c-783c3ee85a41,room=lobby temp=24 1668932807477465026
</code></pre></div></div> <p>Press Ctrl+C when you want to stop the code.</p> <p>Now, let’s launch InfluxDB. You may get a cloud subscription from the AWS <a href="https://aws.amazon.com/marketplace/pp/prodview-4e7raoxoxsl4y?ref_=unifiedsearch">marketplace</a>, or directly from the <a href="https://cloud2.influxdata.com/">influxdata</a> portal.</p> <p>On InfluxDB, go to Load Data &gt; Native Subscriptions and create a new subscription. Set the subscription name as <em>temp_sensor</em>. Enter a description like <em>Messages from AWS MQTT</em>. Go to Security details &gt; certificate and copy/paste the CA certificate, private key and public certificate downloaded from AWS while creating the thing in IoT. Set the topic as <em>temperature</em>, because that’s where we were publishing the messages. Then set the write destination to a bucket, which can be created if it doesn’t exist yet. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s1bdzfhaeh1kfov6asiw.png" alt="Create bucket for subscription in InfluxDB" /></p> <p>That’s it, save the subscription.</p> <p>We can now visit the data explorer and see the graph for the data sent. If we hover over the graph, we should be able to see the values for each of the tags (rooms), which have different colors. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ejpz8urw9lgitw17rdpt.png" alt="Data explorer on InfluxDB" /></p> <p>This way, we can send MQTT metrics to InfluxDB with native subscriptions and visualize them. A couple of things before finishing: you may stop the subscription when you don’t want to keep it running, <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ju4u5524akcfd8amo4g3.png" alt="Stop subscription link in InfluxDB" /></p> <p>and there is a notifications link on the subscription that should help you with logs for troubleshooting errors. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yu1lclrg0z52sk5y4c1l.png" alt="Notifications link in InfluxDB subscription" /></p> <p>Thank you for reading !!!</p>

Query HarperDB’s REST API via Apollo GraphQL

<p>This post first appeared on <a href="https://dev.to/networkandcode/query-harperdbs-rest-api-via-apollo-graphql-21n1">dev.to</a></p> <h2 id="introduction">Introduction</h2> <p>Hi there :wave:, in this post, we shall launch an Apollo server with Node, and perform some read &amp; write operations from the Apollo studio sandbox against a HarperDB instance. Having some GraphQL fundamentals would help you understand this better, though some parts of the code will be explained.</p> <h2 id="studio">Studio</h2> <p>I have an account in HarperDB <a href="https://studio.harperdb.io">studio</a> and have set up an organisation and a free tier instance there. Once logged in to the instance, the config tab would give the instance credentials we would need to connect from our code. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ou6rpws6ql4of4hqi773.png" alt="Instance config" /></p> <p>Note that the studio credentials (email/password) are different from the instance credentials (username/password). You can have more than one instance in the studio, though only one free tier instance across organisations.</p> <p>Copy the instance url and the basic token from the config shown in the image above, as we would need those while sending API calls.</p>
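<p>If you want to sanity-check the URL and token before wiring up Apollo, you can already send a direct call to the instance. A minimal sketch, assuming HarperDB’s describe_all operation from its API docs; the URL and token values are placeholders for your own instance details.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import requests  # pip install requests

HDB_INSTANCE_URL = 'https://&lt;instance-name&gt;.harperdbcloud.com'  # placeholder
BASIC_TOKEN = 'Basic &lt;token-from-studio&gt;'                       # placeholder

# describe_all lists the schemas/tables visible to this user
res = requests.post(
    HDB_INSTANCE_URL,
    json={'operation': 'describe_all'},
    headers={'Authorization': BASIC_TOKEN},
)
print(res.status_code, res.json())
</code></pre></div></div> <p>A 200 response here means the same URL and token will work from the Apollo server too.</p>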
<img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ou6rpws6ql4of4hqi773.png" alt="Instance config" /></p> <p>Note that the studio credentials(email/password) are different from the instance credentials(username/password). You can have more than one instance in the studio. You can have only one free tier instance though across organisations.</p> <p>Copy the instance url and the basic token from the config shown in the image above, as we would need those while sending API calls.</p> <h2 id="clone">Clone</h2> <p>Clone the code from GitHub.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ git clone git@github.com:networkandcode/fruits-apollo-hdb.git </code></pre></div></div> <h2 id="dependencies">Dependencies</h2> <p>The <a href="https://github.com/networkandcode/fruits-apollo-hdb/blob/main/server/package.json#L13-L18">dependencies</a> for the code are as follows.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> "dependencies": { "@apollo/datasource-rest": "^5.0.2", "@apollo/server": "^4.3.3", "dotenv": "^16.0.3", "graphql": "^16.6.0" } </code></pre></div></div> <p>We can install these.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ npm i </code></pre></div></div> <p>About the packages above:</p> <ul> <li><code class="language-plaintext highlighter-rouge">@apollo/datasource-rest</code> is used to make rest api calls from apollo to endpoints such as HarperDB’s instance url.</li> <li><code class="language-plaintext highlighter-rouge">@apollo/server</code> is used to launch the apollo server itself.</li> <li><code class="language-plaintext highlighter-rouge">dotenv</code> is used to read env vars.</li> <li><code class="language-plaintext highlighter-rouge">graphql</code> is the core graphql library.</li> </ul> <p>You may refer to this <a href="https://www.apollographql.com/docs/apollo-server/getting-started/#step-2-install-dependencies">guide</a> for more info.</p> <h2 id="variable">Variable</h2> <p>Add an <code class="language-plaintext highlighter-rouge">.env</code> file, we just need one variable.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat .env HDB_INSTANCE_URL=https://&lt;instance-name&gt;.harperdbcloud.com </code></pre></div></div> <h2 id="start">Start</h2> <p>There is a start script with invokes the index file.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat package.json| grep start "start": "node index.js" </code></pre></div></div> <p>Good to start the server</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ npm start &gt; server@1.0.0 start &gt; node index.js 🚀 Server ready at: http://localhost:4000/ </code></pre></div></div> <h2 id="apollo-studio">Apollo studio</h2> <p>Once the server is started we should be able to see the sandbox studio on the browser at http://localhost:4000/. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y9t1050mz0k32nkiaah0.png" alt="Apollo studio" /></p> <h2 id="token">Token</h2> <p>Let’s set the authorisation header, you may paste the basic token obtained from HarperDB studio here. 
<img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8078w2kbat6uzcw939on.png" alt="Basic auth token" /></p> <p>We are setting this <a href="https://github.com/networkandcode/fruits-apollo-hdb/blob/main/server/index.js#L20">token</a> in the server context.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> const token = req.headers.authorization; </code></pre></div></div> <p>Which would be <a href="https://github.com/networkandcode/fruits-apollo-hdb/blob/main/server/datasource.js#L46-L48">passed</a> to HarperDB while the REST API call is being made.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> willSendRequest(_path, request) { request.headers['authorization'] = this.token; }; </code></pre></div></div> <h2 id="code">Code</h2> <p>Let’s see what code files we have and their purpose.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ls *.js datasource.js index.js resolvers.js schema.js </code></pre></div></div> <ul> <li><code class="language-plaintext highlighter-rouge">datasource.js</code> is where we are making calls to HarperDB via REST API.</li> <li><code class="language-plaintext highlighter-rouge">index.js</code> has code that wraps the schema and resolvers in the server and starts it.</li> <li><code class="language-plaintext highlighter-rouge">resolvers.js</code> contains resolvers that resolves all of the queries and mutations defined in <code class="language-plaintext highlighter-rouge">schema.js</code> by sending calls to HarperDB API defined in <code class="language-plaintext highlighter-rouge">datasource.js</code></li> <li><code class="language-plaintext highlighter-rouge">schema.js</code> contains our graphql schema that has built in types Query and Mutation as well as other user defined types and inputs.</li> </ul> <h3 id="schema">Schema</h3> <p>Our graphql <a href="https://www.apollographql.com/docs/apollo-server/schema/schema/">schema</a> is a collection of type definitions. And hence it’s common to use typeDefs as our schema variable.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat schema.js const typeDefs = ` input CreateTableBody { schema: String! table: String! hash_attribute: String! } input FruitInput { name: String! "calories per 100 gm" calories: Int! } type Query { fruits(schema: String!, table: String!): [Fruit] fruit(schema: String!, table: String!, name: String!): [Fruit] } type Mutation { createSchema(schema: String!): ApiResponse! createTable(body: CreateTableBody!): ApiResponse! insertRecords(schema: String!, table: String!, records: [FruitInput!]! ): ApiResponse! } type Fruit { id: ID! name: String! calories: Int! } type ApiResponse { status: Int! message: String! } `; export default typeDefs; </code></pre></div></div> <p>In the schema, there are two built in types Query and Mutation. Query is meant for read operations where as Mutation refers to operations that involve create, update or deletion of data.</p> <p>In Query, we have defined two fields fruits and fruit. <code class="language-plaintext highlighter-rouge">fruits</code> takes two arguments schema and table, both are strings and are mandatory or non nullable as denoted with <code class="language-plaintext highlighter-rouge">!</code>. 
Both the fruits and fruit fields return the same type of data, denoted by [Fruit]; this indicates an array of records, where each record is of type Fruit. Fruit is not a built-in type like String or Int, hence we have defined it as type Fruit. The type definition for Fruit denotes its structure: it has three fields, id, name and calories, which are of types ID, String and Int respectively, and all three are non-nullable. To summarise, we can have two Query operations, fruits and fruit, and both return a list of fruits (records from HarperDB).</p> <p>Likewise for Mutation, we have three fields, createSchema, createTable and insertRecords, with their respective arguments; all of them return the same type of response, denoted by ApiResponse, which has only two fields, status and message.</p> <p>Note that when a field type is not built in, we define it with type, like type <code class="language-plaintext highlighter-rouge">Fruit</code> and type <code class="language-plaintext highlighter-rouge">ApiResponse</code> in our schema. Likewise, when an argument type is not built in, we define it with input, like input <code class="language-plaintext highlighter-rouge">CreateTableBody</code> and input <code class="language-plaintext highlighter-rouge">FruitInput</code> in our schema.</p> <h3 id="datasource">Datasource</h3> <p>Here, we create a new class by the name HdbApi that inherits from RESTDataSource. Note that we obtain the baseURL from the env var.</p> <p>The willSendRequest function is used to add the header while calls are being sent to the base URL. There are just two functions we define here, as in HarperDB all the calls are of type <a href="https://api.harperdb.io">POST</a>, and there are mainly two types of calls: one that sends SQL style statements, and the other which they call NoSQL operations.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat datasource.js
import { RESTDataSource } from '@apollo/datasource-rest';

class HdbApi extends RESTDataSource {
  baseURL = process.env.HDB_INSTANCE_URL;

  constructor(options) {
    super(options);
    this.token = options.token;
  };

  async sqlQuery(body) {
    // for read operations
    return await this.post(
      '',
      {
        body,
      }
    ).then((res) =&gt; {
      return res;
    })
  };

  async noSqlQuery(body) {
    return await this.post(
      '',
      {
        body,
      }
    ).then((res) =&gt; {
      const { message } = res;
      return { status: 200, message }
    }).catch((err) =&gt; {
      const { status, body } = err.extensions.response;
      const message = body.error;
      return { status, message }
    });
  };

  willSendRequest(_path, request) {
    request.headers['authorization'] = this.token;
  };
};

export default HdbApi;
</code></pre></div></div>
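<p>Stripped of the Apollo plumbing, both helpers simply POST a JSON body to the instance root; the two styles only differ in how the intent is expressed. A minimal sketch (URL and token are placeholders) mirroring the exact bodies the resolvers will build later in this post:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import requests  # pip install requests

url = 'https://&lt;instance-name&gt;.harperdbcloud.com'   # placeholder
headers = {'Authorization': 'Basic &lt;token&gt;'}        # placeholder

# SQL style: one generic operation, the statement carries the intent
sql_body = {'operation': 'sql', 'sql': 'SELECT * FROM myschema.fruits'}

# NoSQL style: the operation itself names the action
nosql_body = {'operation': 'create_schema', 'schema': 'myschema'}

for body in (sql_body, nosql_body):
    print(requests.post(url, json=body, headers=headers).json())
</code></pre></div></div>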
<h3 id="index">Index</h3> <p>In the index file, we import the third-party modules first, and then our schema, resolvers and datasource. We use dotenv to read the variable defined in <code class="language-plaintext highlighter-rouge">.env</code>.</p> <p>We then create an Apollo server instance that takes two arguments: schema and resolvers.</p> <p>Finally, we start the server, giving it options such as the listening port and the context.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat index.js
import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';
import dotenv from 'dotenv';

import typeDefs from './schema.js';
import resolvers from './resolvers.js';
import HdbApi from './datasource.js';

dotenv.config();

const server = new ApolloServer({
  typeDefs,
  resolvers,
});

const { url } = await startStandaloneServer(server, {
  listen: { port: 4000 },
  context: async({ req }) =&gt; {
    const { cache } = server;
    const token = req.headers.authorization;
    return {
      dataSources: {
        hdbApi: new HdbApi({ cache, token }),
      }
    }
  }
});

console.log(`🚀 Server ready at: ${url}`);
</code></pre></div></div> <p>The context returns a dataSources dictionary that has an hdbApi key in it, and this points to a new HdbApi object which is defined in the datasource.</p> <p>We can refer to this context in the resolvers.</p> <h3 id="resolvers">Resolvers</h3> <p>There are 2 main resolvers for the built-in types Query and Mutation. Inside each resolver, we have a function for each field of the type. In our schema we have defined two Query fields, fruits and fruit, hence there are two such functions inside the Query resolver too. Similarly, we have three functions for the Mutations: createSchema, createTable and insertRecords.</p> <p>What these functions return comes back as the result in Apollo Studio for our GraphQL operation. The return statements refer to contextValue, which is the context function we defined in index. So the resolvers are able to use the context defined in the server, which in turn points to the API calls we defined in the datasource.</p> <p>Note that there are four <a href="https://www.apollographql.com/docs/apollo-server/data/resolvers/#resolver-arguments">arguments</a> for each function: parent, args, contextValue and info. We would only need the 2nd and 3rd args for this exercise though. 
We have already seen <code class="language-plaintext highlighter-rouge">contextValue</code> above, and <code class="language-plaintext highlighter-rouge">args</code> refers to the variables we set while running an operation in the Apollo studio.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat resolvers.js
const resolvers = {
  Query: {
    fruits: (_, args, contextValue) =&gt; {
      const { schema, table } = args;
      const body = {
        "operation": "sql",
        "sql": `SELECT * FROM ${schema}.${table}`,
      }
      return contextValue.dataSources.hdbApi.sqlQuery(body);
    },
    fruit: (_, args, contextValue) =&gt; {
      const { schema, table, name } = args;
      // note: the arguments are interpolated straight into the SQL string,
      // so validate or escape them before using this pattern in production
      const body = {
        "operation": "sql",
        "sql": `SELECT * FROM ${schema}.${table} where name = "${name}"`
      };
      return contextValue.dataSources.hdbApi.sqlQuery(body);
    },
  },
  Mutation: {
    // There are four args: parent, args, contextValue and info
    createSchema: (_, { schema }, contextValue) =&gt; {
      const body = {
        operation: "create_schema",
        schema,
      };
      return contextValue.dataSources.hdbApi.noSqlQuery(body);
    },
    createTable: (parent, { body }, contextValue, info) =&gt; {
      body = {
        ...body,
        operation: "create_table",
      }
      return contextValue.dataSources.hdbApi.noSqlQuery(body);
    },
    insertRecords: (parent, { schema, table, records }, contextValue, info) =&gt; {
      const body = {
        operation: "insert",
        schema,
        table,
        records
      }
      return contextValue.dataSources.hdbApi.noSqlQuery(body);
    }
  }
};

export default resolvers;
</code></pre></div></div> <h3 id="operations">Operations</h3> <p>Ok, so all the setup is done. We are good to go with the Query / Mutation operations. Let’s start with Mutations.</p> <h2 id="create-schema">Create schema</h2> <p>Let’s try creating a schema in HarperDB with a mutation in Apollo GraphQL. Note that this schema is not the GraphQL schema; it’s the schema in HarperDB. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lbuekkn64gvtyyff6ayn.png" alt="Create schema" /></p> <p>Building the operation and variables is easy; you just need to click on the plus sign next to the operation and field.</p> <p>We can validate this in the browse section of HarperDB studio. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/12zz0f8lmwd59rskcihg.png" alt="Schema created" /></p> <p>If we run the mutation again in Apollo studio, we should see an error message that says the schema already exists. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lumt7n23m9pfaz67v3fn.png" alt="Schema exists" /></p> <h2 id="create-table">Create table</h2> <p>The schema is created; we can now try to add a table in this schema. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/01gfnkjb6a5cb80ch5no.png" alt="Create table" /></p> <p>The operation is ready, however we need to fill in the data for the body variable, for which we can click on the argument. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/e6bxpyvayymrbhp0hf28.png" alt="Body argument" /></p> <p>We should be able to see all the input data fields defined for the argument. We can add them all with the plus button. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p5dt21yp35da4ywvxqp3.png" alt="Input data fields for body" /></p> <p>The variable should be populated for us with null fields. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/orhxwkos5imyowrclpra.png" alt="Variable with null values" /></p> <p>We can replace null with the actual values. 
<img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/mohdqjvk43j9ynxv8val.png" alt="Variable with actual values" /></p> <p>We can run the operation and see the response. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qziq2daawq6jzt38jb8l.png" alt="Table created" /></p> <p>If you try again, you should see an error message. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/30wz761toujajtokzlvr.png" alt="Table already exists" /></p> <h2 id="add-entries-to-table">Add entries to table</h2> <p>The operation in this case would be:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mutation Mutation($schema: String!, $table: String!, $records: [FruitInput!]!) { insertRecords(schema: $schema, table: $table, records: $records) { message status } } </code></pre></div></div> <p>The variables would be:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> "schema": "myschema", "table": "fruits", "records": [ { "name": "Apple", "calories": 52 }, { "name": "Banana", "calories": 89 } ] } </code></pre></div></div> <p>And the response as follows, after running the mutation:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{ "data": { "insertRecords": { "message": "inserted 2 of 2 records", "status": 200 } } } </code></pre></div></div> <h2 id="retrieve-records">Retrieve records</h2> <p>Let’s try a query operation this time, for retrieving all records from the table.</p> <p>The operation is:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>query Fruits($schema: String!, $table: String!) { fruits(schema: $schema, table: $table) { calories id name } } </code></pre></div></div> <p>The variables are:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{ "schema": "myschema", "table": "fruits" } </code></pre></div></div> <p>And the result of the query operation would be:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{ "data": { "fruits": [ { "calories": 52, "id": "3963a29c-d696-44ae-a3d8-e11de48bf339", "name": "Apple" }, { "calories": 89, "id": "eec06bc5-85cf-48eb-87d2-89a25fd9f3c1", "name": "Banana" } ] } } </code></pre></div></div> <h2 id="retrieve-selective-records">Retrieve selective records</h2> <p>And one final query to retrieve selective records. The query is:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>query Query($schema: String!, $table: String!, $name: String!) { fruit(schema: $schema, table: $table, name: $name) { calories id name } } </code></pre></div></div> <p>Variables:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{ "schema": "myschema", "table": "fruits", "name": "Apple" } </code></pre></div></div> <p>And result of the query:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{ "data": { "fruit": [ { "calories": 52, "id": "3963a29c-d696-44ae-a3d8-e11de48bf339", "name": "Apple" } ] } } </code></pre></div></div> <h2 id="summary">Summary</h2> <p>Thus we were able to access the Apollo studio on the browser and send queries and mutations to HarperDB. HarperDB is considering GraphQL functionality on their roadmap. 
<h2 id="summary">Summary</h2> <p>Thus we were able to access the Apollo studio in the browser and send queries and mutations to HarperDB. HarperDB is considering GraphQL functionality on their roadmap. Check out their <a href="https://feedback.harperdb.io/">Feedback Board</a> and give it an upvote if you think it would be helpful!</p> <p>Feel free to ask questions if the flow is not clear, and to correct me if there are any mistakes. Thank you for reading !!!</p> <p>Image credit: <a href="https://source.unsplash.com/featured/?coding">unsplash</a></p>

HarperDB Helm chart on Artifact Hub

<p>This post first appeared on <a href="https://dev.to/networkandcode/harperdb-helm-chart-on-artifact-hub-3066">dev.to</a></p> <h2 id="introduction">Introduction</h2> <p>Hey :wave:, in this post, we shall see how to create a helm chart for HarperDB based on the boilerplate chart from the helm cli, lint/dry-run it, push it to the artifact <a href="https://artifacthub.io">hub</a>, and then reuse it to install a helm release on a Kubernetes cluster. You can get the helm chart used in this post from this <a href="https://github.com/networkandcode/harperdb-deployments/tree/main/helm-charts/standard/harperdb">link</a>.</p> <p>You may check this <a href="https://dev.to/aws-builders/harperdb-with-helm-on-eks-3fb9">post</a> if you are looking to install harperdb with a custom, minimal helm chart.</p> <h2 id="search">Search</h2> <p>As of this writing, there is no chart available on artifact hub for harperdb; the screenshot below should validate that. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/idysqmfnelkdj60fe0ii.png" alt="HarperDB chart not found in artifact hub" /></p> <p>So our goal is to push the harperdb chart to artifact hub, so that the search result shows an entry.</p> <p>Let’s give ChatGPT a try. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2gd3nvxzwcgdgbut9h0y.png" alt="Search for harperdb helm chart on ChatGPT" /></p> <p>We could also search from the helm cli to see if there exists a chart for harperdb, for which you first have to install <a href="https://helm.sh/docs/intro/install/">helm</a> on your system.</p> <p>On Mac, it could be installed as follows.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ brew install helm
</code></pre></div></div> <p>Now, we can search for the chart.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm search hub harperdb
No results found
</code></pre></div></div> <p>This result matches the search we did on the website.</p>
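<p>Incidentally, helm search hub is backed by Artifact Hub’s public API, so the same search can be done programmatically; a sketch (the endpoint and the ts_query_web parameter are taken from Artifact Hub’s API docs, so double-check them there before relying on this):</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import requests  # pip install requests

res = requests.get(
    'https://artifacthub.io/api/v1/packages/search',
    params={'ts_query_web': 'harperdb', 'kind': 0},  # kind 0 = Helm charts
)
for pkg in res.json().get('packages', []):
    print(pkg.get('name'), pkg.get('version'))
</code></pre></div></div> <p>An empty list here matches the “No results found” we got from the cli.</p>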
<h2 id="chart">Chart</h2> <p>Ok, we can create our chart. Let’s first create a boilerplate chart with the name harperdb.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm create harperdb
Creating harperdb
</code></pre></div></div> <p>A chart is created for us; it’s nothing but a directory with a specific layout.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ls -R harperdb
Chart.yaml  charts  templates  values.yaml

harperdb/charts:

harperdb/templates:
NOTES.txt     deployment.yaml  ingress.yaml  serviceaccount.yaml
_helpers.tpl  hpa.yaml         service.yaml  tests

harperdb/templates/tests:
test-connection.yaml
</code></pre></div></div> <p>Let’s make a few changes.</p> <p>Change the appVersion; I am going to use the latest version of harperdb found in the <a href="https://hub.docker.com/r/harperdb/harperdb/tags">tags</a> section on Docker Hub.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat harperdb/Chart.yaml| grep appVersion
appVersion: "1.16.0"
$ sed -i 's/appVersion: "1.16.0"/appVersion: "4.0.4"/g' harperdb/Chart.yaml
$ cat harperdb/Chart.yaml| grep appVersion
appVersion: "4.0.4"
</code></pre></div></div>
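<p>If you’d rather not browse Docker Hub manually, the tags can also be listed programmatically; a sketch against Docker Hub’s v2 repositories endpoint (the URL shape is an assumption based on its public API, so verify it before scripting around it):</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import requests  # pip install requests

res = requests.get(
    'https://hub.docker.com/v2/repositories/harperdb/harperdb/tags',
    params={'page_size': 10},  # most recent tags first
)
for tag in res.json().get('results', []):
    print(tag['name'])
</code></pre></div></div>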
<p>We don’t need any sub-charts for now, so we can remove that directory.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rm -r harperdb/charts
</code></pre></div></div> <p>We can also remove the tests directory.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rm -r harperdb/templates/tests
</code></pre></div></div> <h2 id="values">Values</h2> <p>We need to make a few modifications in the values file.</p> <h3 id="image">Image</h3> <p>Let’s set the image.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat harperdb/templates/deployment.yaml | grep image:
          image: ":"
</code></pre></div></div> <p>The tag can come from the appVersion; we just need to set the image repository in values. By default, it would have nginx.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ grep repository: harperdb/values.yaml
  repository: nginx
</code></pre></div></div> <p>Edit values by replacing nginx with harperdb/harperdb.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sed -i 's#repository: nginx#repository: harperdb/harperdb#g' harperdb/values.yaml
$ grep repository: harperdb/values.yaml
  repository: harperdb/harperdb
</code></pre></div></div> <h3 id="service">Service</h3> <p>HarperDB uses port 9925 for the REST API, and we would be exposing only this here, though there are other ports like 9926 and 9932 for custom functions, clustering etc.</p> <p>In our chart, the service port is set in <code class="language-plaintext highlighter-rouge">.Values.service.port</code>, and the same port is used as the container port too; we can stick with that for simplicity.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ grep -ir service.port harperdb
harperdb/templates/NOTES.txt:  echo http://$SERVICE_IP:
harperdb/templates/ingress.yaml:
harperdb/templates/service.yaml:    - port:
harperdb/templates/deployment.yaml:          containerPort:
</code></pre></div></div> <p>Let’s change the service port in values.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ grep port: harperdb/values.yaml
  port: 80
$ sed -i 's/port: 80/port: 9925/g' harperdb/values.yaml
$ grep port: harperdb/values.yaml
  port: 9925
</code></pre></div></div> <p>Also set the service type to LoadBalancer.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sed -i 's/type: ClusterIP/type: LoadBalancer/g' harperdb/values.yaml
$ cat harperdb/values.yaml
--TRUNCATED--
service:
  type: LoadBalancer
  port: 9925
--TRUNCATED--
</code></pre></div></div> <h3 id="security-context">Security context</h3> <p>Modify the pod security context; you may check this <a href="https://dev.to/aws-builders/harperdb-on-eks-1bcb">post</a> to know why we used 1000 as the fsGroup.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ grep -i -A 2 podSecurityContext harperdb/values.yaml
podSecurityContext:
  fsGroup: 1000
</code></pre></div></div> <h3 id="resources">Resources</h3> <p>Similarly, set the cpu and memory requirements in values.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ grep -A 6 resources harperdb/values.yaml
resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 100m
    memory: 128Mi
</code></pre></div></div> <h2 id="secret">Secret</h2> <p>The chart we created doesn’t have a secret manifest, so we can create it. 
This manifest follows standards similar to the service account manifest.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt; harperdb/templates/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name:
  labels:
stringData:
EOF
</code></pre></div></div> <p>We can set the appropriate values for the secret manifest.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt;&gt; harperdb/values.yaml
secret:
  entries:
    HDB_ADMIN_USERNAME: admin
    HDB_ADMIN_PASSWORD: password12345
  create: true
  name: harperdb
EOF
</code></pre></div></div> <p>We can then modify the helpers file.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt;&gt; harperdb/templates/_helpers.tpl
EOF
</code></pre></div></div> <h2 id="pvc">PVC</h2> <p>Likewise, there is no pvc template in the chart, so we can add that.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt; harperdb/templates/pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name:
  labels:
spec:
  accessModes:
    -
  resources:
    requests:
      storage:
EOF
</code></pre></div></div> <p>Set the appropriate values for the pvc.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt;&gt; harperdb/values.yaml
pvc:
  accessMode: ReadWriteOnce
  create: true
  mountPath: /opt/harperdb/hdb
  name: harperdb
  storage: 5Gi
EOF
</code></pre></div></div> <p>We can then modify the helpers file.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt;&gt; harperdb/templates/_helpers.tpl
EOF
</code></pre></div></div> <h2 id="deployment">Deployment</h2> <p>We are going to make a few changes to the deployment manifest, so that it looks like below.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt; harperdb/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name:
  labels:
spec:
  replicas:
  selector:
    matchLabels:
  template:
    metadata:
      annotations:
      labels:
    spec:
      imagePullSecrets:
      serviceAccountName:
      securityContext:
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName:
      containers:
        - name:
          volumeMounts:
            - name: data
              mountPath:
          envFrom:
            - secretRef:
                name:
          securityContext:
          image: ":"
          imagePullPolicy:
          ports:
            - name: http
              containerPort:
              protocol: TCP
          resources:
      nodeSelector:
      affinity:
      tolerations:
EOF
</code></pre></div></div> <p>In the template above, we have injected the secrets as env vars in the envFrom section of the container, and made changes related to the volume by defining it in the pod spec and mounting it in the container.</p> <h2 id="lint">Lint</h2> <p>Our chart is kinda ready… Ok, so now let’s do the linting to see if it’s proper.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm lint harperdb
==&gt; Linting harperdb
[INFO] Chart.yaml: icon is recommended

1 chart(s) linted, 0 chart(s) failed
</code></pre></div></div> <p>Seems good.</p> <p>We can now try generating the kubernetes manifests; this won’t deploy anything yet. You can try <code class="language-plaintext highlighter-rouge">helm template harperdb</code> or <code class="language-plaintext highlighter-rouge">helm template harperdb --debug</code>; the debug flag helps with debugging issues.</p>
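<p>The rendered output can get long; to eyeball just what objects the chart produces, you can pipe it through a few lines of Python. This is my own helper sketch (assumes PyYAML is installed), not part of Helm.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import subprocess
import yaml  # pip install pyyaml

# render the chart locally without installing it
rendered = subprocess.run(
    ['helm', 'template', 'harperdb'],
    capture_output=True, text=True, check=True,
).stdout

# a chart renders to a multi-document YAML stream
for doc in yaml.safe_load_all(rendered):
    if doc:
        print(doc['kind'], doc['metadata']['name'])
</code></pre></div></div> <p>For this chart you would expect a Secret, a PersistentVolumeClaim, a Service, a ServiceAccount and a Deployment in the listing.</p>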
<h2 id="kubeconfig">Kubeconfig</h2> <p>Make sure you have a running kubernetes cluster. I have an EKS cluster, and I would be using the <a href="https://aws.amazon.com/cli/">aws</a> cli to update the kubeconfig.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws eks update-kubeconfig --name k8s-cluster-01 --region us-east-1
</code></pre></div></div> <p>There are two nodes in my cluster.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get nodes
NAME                             STATUS   ROLES    AGE   VERSION
ip-192-168-22-158.ec2.internal   Ready    &lt;none&gt;   24d   v1.23.13-eks-fb459a0
ip-192-168-38-226.ec2.internal   Ready    &lt;none&gt;   24d   v1.23.13-eks-fb459a0
</code></pre></div></div> <p>Create a namespace with kubectl.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl create ns harperdb
</code></pre></div></div> <h2 id="dry-run">Dry run</h2> <p>As the cluster is ready, we can try a dry-run installation with helm.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm install harperdb harperdb -n harperdb --dry-run --debug
</code></pre></div></div> <p>If there are no errors, we can proceed to packaging.</p> <h2 id="package">Package</h2> <p>Our chart seems good, so we can package it.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm package harperdb
</code></pre></div></div> <p>This should create a compressed file.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ls | grep tgz
harperdb-0.1.0.tgz
</code></pre></div></div> <p>Here 0.1.0 refers to the chart version.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat harperdb/Chart.yaml | grep version:
version: 0.1.0
</code></pre></div></div> <h2 id="repo">Repo</h2> <p>We need a repo where we can keep this package. I am using this <a href="https://github.com/networkandcode/networkandcode.github.io">repo</a> for the purpose. This repo is also set up with GitHub pages, and the website is accessible at this <a href="https://networkandcode.github.io">URL</a>. 
So you may create a GitHub repo with <a href="https://pages.github.com">pages</a> set up.</p> <p>Alright, I am cloning my repo.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone git@github.com:networkandcode/networkandcode.github.io.git
</code></pre></div></div> <p>Create a directory there for helm packages.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd networkandcode.github.io/
$ mkdir helm-packages
</code></pre></div></div> <p>We can move the package we created earlier into this directory.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mv ~/harperdb-0.1.0.tgz helm-packages/
$ ls helm-packages/
harperdb-0.1.0.tgz
</code></pre></div></div> <p>We now need to create an index file.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm repo index helm-packages/
$ ls helm-packages/
harperdb-0.1.0.tgz  index.yaml
</code></pre></div></div> <p>The index file is populated automatically with these details.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat helm-packages/index.yaml
apiVersion: v1
entries:
  harperdb:
  - apiVersion: v2
    appVersion: 4.0.1
    created: "2023-02-02T05:58:37.022518464Z"
    description: A Helm chart for Kubernetes
    digest: 1282e5919f2d6889f1e3dd849f27f2992d8288087502e1872ec736240dfd6ebf
    name: harperdb
    type: application
    urls:
    - harperdb-0.1.0.tgz
    version: 0.1.0
generated: "2023-02-02T05:58:37.020383374Z"
</code></pre></div></div>
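<p>The digest in the index is simply the SHA-256 checksum of the packaged chart, so you can verify it yourself with a couple of lines; a quick sketch:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import hashlib

# the digest field in index.yaml is the sha256 of the .tgz package
with open('helm-packages/harperdb-0.1.0.tgz', 'rb') as f:
    print(hashlib.sha256(f.read()).hexdigest())
</code></pre></div></div> <p>The printed hash should match the digest line shown above.</p>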
<img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4feefqyfini6m7ynudrb.png" alt="harperdb on chatgpt" /></p> <p>Now let’s do the cli way.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm search hub harperdb --max-col-width 1000 URL CHART VERSION APP VERSION DESCRIPTION https://artifacthub.io/packages/helm/networkandcode/harperdb 0.1.0 4.0.1 A Helm chart for Kubernetes </code></pre></div></div> <p>Wow our chart is showing up…</p> <h2 id="install">Install</h2> <p>We can open the URL shown above and see the installation instructions.</p> <p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sbxr87siobilam20v9k4.png" alt="Installation instructions" /></p> <p>Let’s run those commands, I am going to use -n for installing it in a separate namespace.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm repo add networkandcode https://networkandcode.github.io/helm-packages "networkandcode" has been added to your repositories $ helm install my-harperdb networkandcode/harperdb --version 0.1.0 -n harperdb </code></pre></div></div> <h2 id="validate">Validate</h2> <p>Alright, so the release is installed, it’s time to validate. First let’s check the helm release status.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm list -n harperdb NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION my-harperdb harperdb 1 2023-02-02 07:14:59.586892384 +0000 UTC deployed harperdb-0.1.0 4.0.1 </code></pre></div></div> <p>Check the Kubernetes workloads.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get all -n harperdb NAME READY STATUS RESTARTS AGE pod/my-harperdb-7b66d4f7c5-xtpvw 1/1 Running 0 2m7s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/my-harperdb LoadBalancer 10.100.127.117 a6e762ccc1e2d482a8528a7760544761-2140283724.us-east-1.elb.amazonaws.com 9925:30478/TCP 2m9s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/my-harperdb 1/1 1 1 2m8s NAME DESIRED CURRENT READY AGE replicaset.apps/my-harperdb-7b66d4f7c5 1 1 1 2m9s </code></pre></div></div> <h2 id="api">API</h2> <p>We can test schema creation with an API call.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ HDB_API_ENDPOINT_HOSTNAME=$(kubectl get svc my-harperdb -n harperdb -o jsonpath={.status.loadBalancer.ingress[0].hostname}) $ curl --location --request POST http://${HDB_API_ENDPOINT_HOSTNAME}:9925 --header 'Content-Type: application/json' --header 'Authorization: Basic YWRtaW46cGFzc3dvcmQxMjM0NQ==' --data-raw '{ "operation": "create_schema", "schema": "my-schema" }' {"message":"schema 'my-schema' successfully created"} </code></pre></div></div> <p>Note that I have parsed the hostname as it gives a hostname for the external IP in EKS. Well, so the API call was successful. 
<p>Nice, we were able to accomplish the goal !!!</p> <h2 id="clean-up">Clean up</h2> <p>I am going to just clean up the helm and Kubernetes objects.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm uninstall my-harperdb -n harperdb
release "my-harperdb" uninstalled
$ kubectl delete ns harperdb
namespace "harperdb" deleted
</code></pre></div></div> <h2 id="summary">Summary</h2> <p>So we have seen some constructs of helm, and understood how to make a chart for harperdb, push it to artifact hub and subsequently use it to install a harperdb release. Note that you can customise the chart with more options, such as adding a readme, enabling tests, claiming ownership of the chart, and adding more harperdb specific variables w.r.t. clustering, custom functions etc.</p> <p>Thank you for reading!!!</p>

HarperDB with Anthos on GKE

<p>This post first appeared on <a href="https://dev.to/networkandcode/harperdb-with-anthos-on-gke-2nef">dev.to</a></p> <h2 id="introduction">Introduction</h2> <p><a href="https://cloud.google.com/anthos">Anthos</a> is a service from Google cloud with which we can deploy and manage workloads using various options such as Cloud Run, GKE, self managed clusters, hybrid cloud clusters, edge based workloads and so on. In this post, we would focus on an autopilot GKE based cluster and deploy <a href="https://harperdb.io">HarperDB</a> on it with a Helm chart. Please check this <a href="https://github.com/networkandcode/harperdb-deployments/tree/main/helm-charts/minimal/harperdb">link</a> for the helm chart we use in this post.</p> <p>Let’s get started…</p> <h2 id="enable-anthos">Enable Anthos</h2> <p>Search for Anthos on the Google cloud console and enable the Anthos API. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/npyylvuegnn9szqyoouf.png" alt="Search for Anthos" /></p> <h2 id="gke">GKE</h2> <p>We would be going with an Anthos managed GKE cluster, so click on the configure option under the Autopilot option. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dp1mleldlecyempgey9o.png" alt="Auto pilot" /></p> <p>I have kept all options at their defaults. Wait for the cluster to get created; the notifications link should show the status of creation. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nvk28v8j6jkb8oa4buel.png" alt="Notification for cluster creation in progress" /></p> <p>We should now have a GKE cluster by the name autopilot-cluster-1. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2h66wwysiu3al2u44ync.png" alt="Notification for cluster creation completed" /></p> <p>Refresh the page to see the cluster. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ccbl0jpoko2l5ixic4y7.png" alt="Un registered cluster" /></p> <h2 id="register">Register</h2> <p>We have created the GKE cluster via Anthos; however, we also need to register it. Click register and go back to the clusters page, and the cluster should show under Anthos managed clusters. 
<img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/36jd6nfoyffdkulaqbiy.png" alt="Anthos managed cluster" /></p> <p>We can now use the GKE cluster to launch any applications in the usual way with kubectl or helm.</p> <h2 id="kubeconfig">Kubeconfig</h2> <p>Go the cloud shell and run the following command to update kubeconfig.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ gcloud container clusters get-credentials autopilot-cluster-1 --region us-central1 Fetching cluster endpoint and auth data. kubeconfig entry generated for autopilot-cluster-1. </code></pre></div></div> <p>Both kubectl and helm would use this kubeconfig to interact with the cluster.</p> <h2 id="namespace">Namespace</h2> <p>Let’s create a namespace with kubectl.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl create ns harperdb namespace/harperdb created </code></pre></div></div> <h2 id="helm">Helm</h2> <p>We can deploy the Kubernetes objects with Helm, for which let’s check if there is any publicly available helm chart for harperdb chart in the artifact hub.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm search hub harperdb No results found </code></pre></div></div> <p>We don’t have a chart yet on the hub. Hence, I would be using a local minimal helm chart, whose files are as follows.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ls harperdb -tR harperdb: templates Chart.yaml values.yaml harperdb/templates: svc.yaml deploy.yaml pvc.yaml secret.yaml </code></pre></div></div> <p>For more info on the contents of files in this chart, pl. checkout this <a href="https://dev.to/aws-builders/harperdb-with-helm-on-eks-3fb9">post</a>.</p> <p>Alright, so let’s install our helm release on the GKE cluster managed by Anthos.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm install harperdb . -n harperdb W0114 05:34:47.808566 817 warnings.go:70] Autopilot set default resource requests for Deployment harperdb/harperdb, as resource requests were not specified. See http://g.co/gke/autopilot-defaults NAME: harperdb LAST DEPLOYED: Sat Jan 14 05:34:39 2023 NAMESPACE: harperdb STATUS: deployed REVISION: 1 TEST SUITE: None </code></pre></div></div> <h2 id="resources">Resources</h2> <p>Though we have not setup any resource requests in the deployment template, the autopilot cluster had enabled it for us. We can check the deployment spec to see what they have set. 
<h2 id="pod">Pod</h2> <p>Check the pod status.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get po -n harperdb
NAME                        READY   STATUS    RESTARTS   AGE
harperdb-559d48f4f7-6dftw   1/1     Running   0          6m9s
</code></pre></div></div> <h2 id="api-test">API test</h2> <p>The pod is running; we can get the service IP and try sending an API call.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ HDB_API_ENDPOINT_IP=$(kubectl get svc harperdb -n harperdb -o jsonpath={.status.loadBalancer.ingress[0].ip})
$ curl --location --request POST http://${HDB_API_ENDPOINT_IP}:8080 --header 'Content-Type: application/json' --header 'Authorization: Basic YWRtaW46cGFzc3dvcmQxMjM0NQ==' --data-raw '{ "operation": "create_schema", "schema": "prod" }'
{"message":"schema 'prod' successfully created"}
</code></pre></div></div> <p>So our installation went smoothly and it’s working.</p> <h2 id="clean-up">Clean up</h2> <p>The clean up involves four steps. Delete the helm chart directory from cloud shell with <code class="language-plaintext highlighter-rouge">rm -rf harperdb</code>.</p> <p>Unregister the cluster from the Anthos console. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/batd1bbwsmgkslp3rsx9.png" alt="Unregister cluster" /></p> <p>Delete the cluster from the GKE console. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fsxmrubutxiaxmdux1ez.png" alt="Delete the cluster" /></p> <p>Finally, disable the Anthos API.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ gcloud services disable anthos.googleapis.com
Warning: Disabling this service will also automatically disable any running Anthos clusters.

Do you want to continue (y/N)? y
</code></pre></div></div> <h2 id="summary">Summary</h2> <p>So we have seen how to create an autopilot cluster from the Anthos console, installed a helm release for HarperDB on it, and tested it with a sample schema creation. Thank you for reading !!!</p>

HarperDB with Helm on EKS

<p>This post first appeared on <a href="https://dev.to/aws-builders/harperdb-with-helm-on-eks-3fb9">dev.to</a></p> <p>This post builds on top of <a href="https://dev.to/aws-builders/harperdb-on-eks-1bcb">this</a> one, where we discussed how to launch HarperDB on EKS with kubectl. Here, we would use those same manifests, but go with the helm cli tool instead of kubectl for the deployment. Please check this <a href="https://github.com/networkandcode/harperdb-deployments/tree/main/helm-charts/minimal/harperdb">link</a> for the final chart we make in this post.</p> <p>Get set go :fire:</p> <p>First, ensure <a href="https://dev.to/aws-builders/harperdb-on-eks-1bcb">helm</a> is installed. 
If you want to use AWS cloud shell for this, click on the shell icon in the top navigation bar of the cloud console, and execute the following.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
$ chmod 700 get_helm.sh
$ sudo yum install openssl -y
$ ./get_helm.sh
</code></pre></div></div> <p>Ensure the kubeconfig is updated.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws eks update-kubeconfig --name k8s-cluster-01
</code></pre></div></div> <p>Please change the cluster name above to the one you created.</p> <p>Let’s just create the namespace with kubectl; the rest will be created with helm.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl create ns harperdb
namespace/harperdb created
</code></pre></div></div> <p>Create a directory for the helm chart.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mkdir harperdb
$ cd harperdb
</code></pre></div></div> <p>Create a directory for templates in the chart.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mkdir templates
</code></pre></div></div> <p>Copy the manifests from this <a href="https://dev.to/aws-builders/harperdb-on-eks-1bcb">post</a> and keep them in the templates directory.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ls templates/
deploy.yaml  pvc.yaml  secret.yaml  svc.yaml
</code></pre></div></div> <p>Create the chart file, where we would keep just the name and version; we are holding only the mandatory information here, though we can add more. (Recent Helm 3 releases also expect an apiVersion line in Chart.yaml, so add one if the install complains.)</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt; Chart.yaml
name: harperdb
version: 1.0.0
EOF
</code></pre></div></div> <p>What we created so far is our harperdb helm chart; we are starting with chart version 1.0.0.</p> <p>We can use this to install the release in a separate namespace. The namespace below means the helm release namespace and not the kubernetes namespace. And <code class="language-plaintext highlighter-rouge">.</code> refers to the chart directory, which is the current directory.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm install harperdb . -n harperdb
NAME: harperdb
LAST DEPLOYED: Fri Jan 13 09:52:08 2023
NAMESPACE: harperdb
STATUS: deployed
REVISION: 1
TEST SUITE: None
</code></pre></div></div> <p>We are using the same name for both the helm and kubernetes namespaces. 
Note that the kubernetes namespace is mentioned in all of the manifests.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat templates/deploy.yaml | grep namespace namespace: harperdb </code></pre></div></div> <p>The helm release is deployed.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ helm list -n harperdb NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION harperdb harperdb 1 2023-01-13 09:52:08.953758683 +0000 UTC deployed harperdb-1.0.0 </code></pre></div></div> <p>This has deployed the workloads in kubernetes, which we can check with kubectl.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get deploy,pvc,secret,svc -n harperdb NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/harperdb 1/1 1 1 5m11s NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE persistentvolumeclaim/harperdb Bound pvc-6ef558ee-86ad-460d-8379-c34a12aaf283 5Gi RWO gp2 5m11s NAME TYPE DATA AGE secret/default-token-llbqm kubernetes.io/service-account-token 3 3h50m secret/harperdb Opaque 2 5m11s secret/sh.helm.release.v1.harperdb.v1 helm.sh/release.v1 1 5m11s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/harperdb LoadBalancer 10.100.29.79 a86501cc7a7024fff92b7212cd844e45-1677629392.us-east-1.elb.amazonaws.com 8080:31549/TCP 5m11s </code></pre></div></div> <p>Let’s test with an API call. The endpoint refers to the loadbalancer service.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ HDB_API_ENDPOINT=a86501cc7a7024fff92b7212cd844e45-1677629392.us-east-1.elb.amazonaws.com:8080 $ curl --location --request POST ${HDB_API_ENDPOINT} --header 'Content-Type: application/json' --header 'Authorization: Basic YWRtaW46cGFzc3dvcmQxMjM0NQ==' --data-raw '{ "operation": "create_schema", "schema": "uat" }' {"message":"schema 'uat' successfully created"} </code></pre></div></div> <p>Awesome, it works !!!</p> <p>Let’s now revisit the helm chart and make a couple of changes. So far, we haven’t used a values file, which is commonly used with any helm chart. We need values when we would like to parameterize certain things in our manifests. For instance, let’s say we want to parameterize the service port.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt; values.yaml servicePort: 8080 EOF </code></pre></div></div> <p>We can then change the port section in the service template to refer to the value from the values file.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat templates/svc.yaml | grep port ports: port: </code></pre></div></div>
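<p>For reference, the ports section of templates/svc.yaml would now look something like this (a sketch, assuming only the port is parameterized):</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ports: - name: http port: {{ .Values.servicePort }} targetPort: 9925 </code></pre></div></div>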
<p>Let’s call this chart version 2.0.0 and upgrade our release.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat Chart.yaml | grep version version: 2.0.0 $ helm upgrade harperdb . -n harperdb Release "harperdb" has been upgraded. Happy Helming! NAME: harperdb LAST DEPLOYED: Fri Jan 13 12:35:53 2023 NAMESPACE: harperdb STATUS: deployed REVISION: 2 TEST SUITE: None </code></pre></div></div> <p>We can verify the service port.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get svc harperdb -n harperdb -o jsonpath={.spec.ports[0].port} 8080 </code></pre></div></div> <p>Let’s do another change; this time we would refer to the release name and namespace as the name and namespace for each of the resources.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ grep -ir .Release templates/ templates/pvc.yaml: name: templates/pvc.yaml: namespace: templates/svc.yaml: name: templates/svc.yaml: namespace: templates/deploy.yaml: name: templates/deploy.yaml: namespace: templates/secret.yaml: name: templates/secret.yaml: namespace: </code></pre></div></div> <p>Note that you have to modify each file so that the grep output looks like the above.</p>
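<p>As a sketch, the metadata section of each template, taking templates/pvc.yaml as an example, would then look something like this:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>metadata: name: {{ .Release.Name }} namespace: {{ .Release.Namespace }} </code></pre></div></div>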
<p>Let’s change the chart version to 3.0.0, and upgrade the release.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat Chart.yaml name: harperdb version: 3.0.0 $ helm upgrade harperdb . -n harperdb </code></pre></div></div> <p>We can validate.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get deploy,pvc,secret,svc -n harperdb | awk '{print $1}' NAME deployment.apps/harperdb NAME persistentvolumeclaim/harperdb NAME secret/default-token-llbqm secret/harperdb secret/sh.helm.release.v1.harperdb.v1 NAME service/harperdb </code></pre></div></div> <p>You could see the workload details on the EKS page. Here is a sample screenshot. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/c8z2wy1enafeytfx084g.png" alt="Harperdb on EKS console" /></p> <p>All good, so we have reached the end of this post. We saw how to install harperdb with a minimal helm chart, tested it with an API call, and tweaked the helm chart a bit to understand some of its fundamentals. Thank you for reading !!!</p>This post first appeared on dev.toHarperDB on EKS2023-01-09T00:00:00+00:002023-01-09T00:00:00+00:00/aws,/eks,/harperdb,/kubernetes/2023/01/09/harperdb-on-eks<p>This post first appeared on <a href="https://dev.to/aws-builders/harperdb-on-eks-1bcb">dev.to</a></p> <p>Hi there :wave:, let’s see how to deploy HarperDB on EKS, and then test it with an API call from CURL. You can get the Kubernetes manifests that we make in this post from this <a href="https://github.com/networkandcode/harperdb-deployments/tree/main/kubernetes-manifests">link</a>.</p> <p>Hope you are already familiar with topics such as <a href="https://youtu.be/udSPJXMU9L4?t=1094">Deployment</a>, <a href="https://www.youtube.com/watch?v=7o6koGCsoSc">Load Balancer</a> service, <a href="https://www.youtube.com/watch?v=iTkz1bIacHM">Secret</a> and <a href="https://www.youtube.com/watch?v=vqk0Ccd7sio">Persistent volume claim</a>.</p> <p>Ensure you have the required IAM permissions, have installed the aws, eksctl &amp; kubectl cli tools, and have setup the config and credentials.</p> <p>For me the config is as follows.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat ~/.aws/config [default] region=us-east-1 </code></pre></div></div> <h2 id="cluster">Cluster</h2> <p>We can now create an EKS cluster with eksctl. You may see this <a href="https://www.youtube.com/watch?v=rJQb_whepEY">video</a> for cluster creation from the CLI.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ eksctl create cluster --name eks-cluster --zones=us-east-1a,us-east-1b </code></pre></div></div> <p>This took around 20 mins for me. Once it’s done, we can update the kubeconfig.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws eks update-kubeconfig --name eks-cluster </code></pre></div></div> <h2 id="docker-hub">Docker hub</h2> <p>We can visit the docker <a href="https://hub.docker.com/r/harperdb/harperdb">hub</a> page of harperdb to get an idea of the ports, environment variables, volume path etc.</p> <p>They have given an example docker command as below.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run -d \ -v /host/directory:/opt/harperdb/hdb \ -e HDB_ADMIN_USERNAME=HDB_ADMIN \ -e HDB_ADMIN_PASSWORD=password \ -p 9925:9925 \ harperdb/harperdb </code></pre></div></div> <p>This tells us the volume mount path in the container is /opt/harperdb/hdb, there are 2 environment variables for the username and password, and the container port is 9925. Finally, the image is harperdb/harperdb.</p> <p>We now have enough info to start writing our Kubernetes manifests.</p> <h2 id="kubernetes-manifests">Kubernetes manifests</h2> <p>I am going to create a directory named harperdb where I would keep all the manifests.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mkdir harperdb $ cd harperdb </code></pre></div></div> <p>Let’s begin with the environment variables; we can write both the username and password in a secret object.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt; secret.yaml --- apiVersion: v1 kind: Secret metadata: name: harperdb namespace: harperdb stringData: HDB_ADMIN_USERNAME: admin HDB_ADMIN_PASSWORD: password12345 ... EOF </code></pre></div></div> <p>We can now go with a persistent volume claim that can dynamically create an EBS volume of size 5Gi in AWS.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt; pvc.yaml --- apiVersion: v1 kind: PersistentVolumeClaim metadata: name: harperdb namespace: harperdb spec: accessModes: - ReadWriteOnce resources: requests: storage: 5Gi ... EOF </code></pre></div></div> <p>Then comes the deployment manifest, where we can define the container image, refer to the secret for the env vars, and the pvc for the volume. Note that the volume mount path matches the one in the docker command.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt; deploy.yaml --- apiVersion: apps/v1 kind: Deployment metadata: name: harperdb namespace: harperdb spec: selector: matchLabels: app: harperdb template: metadata: labels: app: harperdb spec: containers: - name: harperdb image: harperdb/harperdb envFrom: - secretRef: name: harperdb volumeMounts: - name: data mountPath: /opt/harperdb/hdb volumes: - name: data persistentVolumeClaim: claimName: harperdb ... 
EOF </code></pre></div></div> <p>Finally, we have to expose the deployment with a service; we know from the docker command that the container port is 9925.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &lt;&lt;EOF &gt; svc.yaml --- apiVersion: v1 kind: Service metadata: name: harperdb namespace: harperdb spec: selector: app: harperdb type: LoadBalancer ports: - name: http port: 8080 targetPort: 9925 ... EOF </code></pre></div></div> <p>Note that we have used 8080 as the service port.</p> <h2 id="workloads">Workloads</h2> <p>Create a namespace named harperdb, where we can create our objects.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl create ns harperdb namespace/harperdb created </code></pre></div></div> <p>We are good to create objects with the 4 manifests.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ls deploy.yaml pvc.yaml secret.yaml svc.yaml $ kubectl create -f . deployment.apps/harperdb created persistentvolumeclaim/harperdb created secret/harperdb created service/harperdb created </code></pre></div></div> <h2 id="fix-pvc">Fix PVC</h2> <p>The pvc should be in Pending status.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get pvc NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE harperdb Pending gp2 7m3s </code></pre></div></div> <p>Please follow this <a href="https://aws.amazon.com/premiumsupport/knowledge-center/eks-persistent-storage/">link</a> to add the IAM role in AWS cloud and the EBS CSI objects on the cluster; this should fix the PVC issue.</p>
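<p>As a rough sketch of what that link walks through (please treat the linked doc as authoritative for the exact flags), enabling the EBS CSI driver on our cluster would look something like this:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ eksctl utils associate-iam-oidc-provider --cluster eks-cluster --approve $ eksctl create iamserviceaccount --name ebs-csi-controller-sa --namespace kube-system --cluster eks-cluster --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy --approve --role-only --role-name AmazonEKS_EBS_CSI_DriverRole $ eksctl create addon --name aws-ebs-csi-driver --cluster eks-cluster --service-account-role-arn arn:aws:iam::&lt;account-id&gt;:role/AmazonEKS_EBS_CSI_DriverRole --force </code></pre></div></div>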
<p>Once done, the pvc should be bound to a persistent volume(pv).</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get pvc -n harperdb NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE harperdb Bound pvc-7c83e38c-b00a-4194-8c67-ba5c9c1118e7 5Gi RWO gp2 9s </code></pre></div></div> <p>And the pv should be mapped to an EBS volume.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-7c83e38c-b00a-4194-8c67-ba5c9c1118e7 5Gi RWO Delete Bound harperdb/harperdb gp2 64s $ kubectl describe pv pvc-7c83e38c-b00a-4194-8c67-ba5c9c1118e7 | grep VolumeID VolumeID: vol-0bbca736346f02aa1 </code></pre></div></div> <p>Note that a persistent volume is a cluster level object and not bound to a namespace. We can check the volume details from the aws cli.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws ec2 describe-volumes --volume-ids vol-0bbca736346f02aa1 --query "Volumes[0].Size" 5 $ aws ec2 describe-volumes --volume-ids vol-0bbca736346f02aa1 --query "Volumes[0].Tags" [ { "Key": "ebs.csi.aws.com/cluster", "Value": "true" }, { "Key": "CSIVolumeName", "Value": "pvc-7c83e38c-b00a-4194-8c67-ba5c9c1118e7" }, { "Key": "kubernetes.io/created-for/pv/name", "Value": "pvc-7c83e38c-b00a-4194-8c67-ba5c9c1118e7" }, { "Key": "kubernetes.io/created-for/pvc/namespace", "Value": "harperdb" }, { "Key": "kubernetes.io/created-for/pvc/name", "Value": "harperdb" } ] </code></pre></div></div> <h2 id="volume-permission-fix">Volume permission fix</h2> <p>So the pvc seems good. Let’s check our application status.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get po -n harperdb NAME READY STATUS RESTARTS AGE harperdb-79694c8b75-6ckn7 0/1 CrashLoopBackOff 4 (80s ago) 3m25s </code></pre></div></div> <p>The application was crashing, but the volume was getting mounted, and the env vars were fine too. I tried commenting out the volumeMounts and volumes sections and updated the deployment.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat deploy.yaml | grep # #volumeMounts: #- name: data #mountPath: /opt/harperdb/hdb #volumes: #- name: data #persistentVolumeClaim: #claimName: harperdb $ kubectl apply -f deploy.yaml </code></pre></div></div> <p>The pod was now running, and I checked the permissions of the directory where we need to mount the volume, and subsequently the id of the group.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl exec -it deploy/harperdb -n harperdb -- bash ubuntu@harperdb-858cc7967d-5jcqm:~$ ls -l /opt/harperdb total 0 drwxr-xr-x 11 ubuntu ubuntu 155 Jan 9 06:59 hdb ubuntu@harperdb-858cc7967d-5jcqm:~$ id uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu) ubuntu@harperdb-858cc7967d-5jcqm:~$ exit </code></pre></div></div> <p>So the group id of the running user is 1000; hence we can set this as the group owner for the volume directory with the fsGroup option. If we don’t specify this, the mountPath would by default be owned by root(user) and root(group), and the running user ubuntu wouldn’t have permission on the mountPath to create any new files. This <a href="https://www.youtube.com/watch?v=PzzFsvadZdY">video</a> has information about fsGroup.</p> <p>We have to change the deployment as follows. We have added the security context with the fsGroup.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat deploy.yaml --- apiVersion: apps/v1 kind: Deployment metadata: name: harperdb namespace: harperdb spec: selector: matchLabels: app: harperdb template: metadata: labels: app: harperdb spec: securityContext: fsGroup: 1000 containers: - name: harperdb image: harperdb/harperdb envFrom: - secretRef: name: harperdb volumeMounts: - name: data mountPath: /opt/harperdb/hdb volumes: - name: data persistentVolumeClaim: claimName: harperdb ... </code></pre></div></div> <p>Alternately, we could also set mountPath to just /opt/harperdb, where we wouldn’t have to set the securityContext. But I thought this was a good use case to get to know fsGroup.</p> <p>Update the deployment.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl apply -f deploy.yaml </code></pre></div></div> <p>Check the workloads.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get all -n harperdb NAME READY STATUS RESTARTS AGE pod/harperdb-cc4f49dfc-m7d5p 1/1 Running 0 55s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/harperdb LoadBalancer 10.100.54.78 a0ba701c9c5a4463bb636551c79b4158-169592876.us-east-1.elb.amazonaws.com 8080:31819/TCP 55s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/harperdb 1/1 1 1 57s NAME DESIRED CURRENT READY AGE replicaset.apps/harperdb-cc4f49dfc 1 1 1 57s </code></pre></div></div>
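<p>If you want to double check the fix, the group on the mount path inside the pod should now be ubuntu (gid 1000); roughly like this (the pod name, timestamps and exact mode bits will differ):</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl exec -it deploy/harperdb -n harperdb -- ls -ld /opt/harperdb/hdb drwxrwsr-x 11 ubuntu ubuntu 155 Jan 9 07:15 /opt/harperdb/hdb </code></pre></div></div>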
<h2 id="api-call">API call</h2> <p>Send a CURL command to test schema creation. The endpoint is from the external IP column in the service. You may check this <a href="https://www.youtube.com/watch?v=25L8VvXgx8w">video</a> to know how to obtain the curl command for harperdb.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ HDB_API_ENDPOINT=http://a0ba701c9c5a4463bb636551c79b4158-169592876.us-east-1.elb.amazonaws.com:8080 $ curl --location --request POST ${HDB_API_ENDPOINT} \ --header 'Content-Type: application/json' \ --header 'Authorization: Basic YWRtaW46cGFzc3dvcmQxMjM0NQ==' \ --data-raw '{ "operation": "create_schema", "schema": "qa" }' {"message":"schema 'qa' successfully created"} </code></pre></div></div> <p>All good, it’s working…</p>
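<p>By the way, the value in the Authorization header is just the base64 encoding of the username and password we kept in the secret, so you can regenerate it yourself:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ echo -n 'admin:password12345' | base64 YWRtaW46cGFzc3dvcmQxMjM0NQ== </code></pre></div></div>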
<h2 id="persistence">Persistence</h2> <p>Test persistence by deleting the pod.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl delete po -n harperdb -l app=harperdb pod "harperdb-cc4f49dfc-m7d5p" deleted </code></pre></div></div> <p>This should launch a new pod.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get po -n harperdb NAME READY STATUS RESTARTS AGE harperdb-cc4f49dfc-c6vnc 1/1 Running 0 57s </code></pre></div></div> <p>We can try sending the same API call again.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ curl --location --request POST ${HDB_API_ENDPOINT} \ --header 'Content-Type: application/json' \ --header 'Authorization: Basic YWRtaW46cGFzc3dvcmQxMjM0NQ==' \ --data-raw '{ "operation": "create_schema", "schema": "qa" }' {"error":"Schema 'qa' already exists"} </code></pre></div></div> <p>It’s not creating a new schema, because the existing schema is restored from the attached volume. Hence, it’s persistent.</p> <h2 id="clean-up">Clean up</h2> <p>Let’s do the clean up…</p> <p>Delete all the objects that were created via manifests.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl delete -f . deployment.apps "harperdb" deleted persistentvolumeclaim "harperdb" deleted secret "harperdb" deleted service "harperdb" deleted </code></pre></div></div> <p>Then delete the namespace.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl delete ns harperdb namespace "harperdb" deleted </code></pre></div></div> <p>Delete the folder.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd .. $ rm -rf harperdb </code></pre></div></div> <p>Finally delete the cluster.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ eksctl delete cluster --name eks-cluster </code></pre></div></div> <p>That’s it for the post, thank you for reading !!!</p>This post first appeared on dev.toGCP Anthos Cluster on AWS2022-10-08T00:00:00+00:002022-10-08T00:00:00+00:00/anthos,/aws,/gcp,/kubernetes/2022/10/08/gcp-anthos-cluster-on-aws<p>This post first appeared on <a href="https://dev.to/aws-builders/gcp-anthos-cluster-on-aws-19p8">dev.to</a></p> <p>Anthos is a software offering from Google, with which we can build kubernetes clusters on nodes both on and off cloud. In this post, we would launch an EC2 instance in AWS and build a single node kubernetes cluster on it with Anthos.</p> <p>Let’s get started.</p> <h2 id="aws-cloud-shell">AWS cloud shell</h2> <p>Login to the AWS console, and access the cloud shell from the top bar. You should see a prompt like below.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[cloudshell-user@ip-10-0-89-211 ~]$ </code></pre></div></div> <h2 id="ec2">EC2</h2> <p>Create an EC2 instance along with other relevant components using these commands. 
Please refer to this <a href="https://dev.to/aws-builders/aws-ec2-launch-instances-the-hard-way-with-cli-2ga3">post</a> for an explanation of what each of these commands does.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mkdir ~/.aws $ cat &gt; ~/.aws/config &lt;&lt;EOF [default] region=ap-south-1 EOF $ export CIDR_BLOCK="10.10.10.0/28" $ aws ec2 create-vpc --cidr-block $CIDR_BLOCK $ export ANTHOS_VPC_ID=$(aws ec2 describe-vpcs | jq -r '.Vpcs[] | select(.CidrBlock == env.CIDR_BLOCK) | .VpcId') $ aws ec2 create-internet-gateway --tag-specifications 'ResourceType=internet-gateway,Tags=[{Key=Name,Value=anthos-igw}]' $ export ANTHOS_IGW_ID=$(aws ec2 describe-internet-gateways --filters Name=tag:Name,Values=anthos-igw --query "InternetGateways[*].InternetGatewayId" --output text) $ aws ec2 attach-internet-gateway --internet-gateway-id $ANTHOS_IGW_ID --vpc-id $ANTHOS_VPC_ID $ export ANTHOS_RTB_ID=$(aws ec2 describe-route-tables | jq -r '.RouteTables[] | select(.VpcId == env.ANTHOS_VPC_ID) | .RouteTableId') $ aws ec2 create-route --route-table-id $ANTHOS_RTB_ID --destination-cidr-block 0.0.0.0/0 --gateway-id $ANTHOS_IGW_ID $ aws ec2 create-subnet --cidr-block $CIDR_BLOCK --vpc-id $ANTHOS_VPC_ID $ export ANTHOS_SUBNET_ID=$(aws ec2 describe-subnets | jq -r '.Subnets[] | select(.CidrBlock == env.CIDR_BLOCK) | .SubnetId') $ export ANTHOS_AVAILABILITY_ZONE=$(aws ec2 describe-subnets | jq -r '.Subnets[] | select(.CidrBlock == env.CIDR_BLOCK) | .AvailabilityZone') $ aws ec2 create-security-group --group-name anthos-sg --description "anthos security group" --vpc-id $ANTHOS_VPC_ID $ export ANTHOS_SG_ID=$(aws ec2 describe-security-groups | jq -r '.SecurityGroups[] | select(.GroupName == "anthos-sg") | .GroupId') $ aws ec2 describe-instance-types | jq '.InstanceTypes[] | select(.MemoryInfo.SizeInMiB == 7680) | (.InstanceType, .VCpuInfo.DefaultVCpus)' "c4.xlarge" 4 $ aws ec2 describe-instance-types | jq '.InstanceTypes[] | select(.MemoryInfo.SizeInMiB == 8192) | select (.VCpuInfo.DefaultVCpus == 2) | .InstanceType' | sort "m4.large" "m5ad.large" "m5a.large" "m5d.large" "m5.large" "m6a.large" "m6gd.large" "m6g.large" "m6i.large" "t2.large" "t3a.large" "t3.large" "t4g.large" $ aws ec2 describe-instance-type-offerings --location-type availability-zone | jq '.InstanceTypeOfferings[] | select(.Location == env.ANTHOS_AVAILABILITY_ZONE) | .InstanceType' | grep t2.large "t2.large" $ aws ec2 create-key-pair --key-name anthosKeyPair --query 'KeyMaterial' --output text &gt; anthosKeyPair.pem $ mkdir ~/.ssh $ mv anthosKeyPair.pem ~/.ssh/ $ aws ec2 run-instances --image-id ami-0bba4b75264ecbfbd --count 1 --instance-type t2.large --key-name anthosKeyPair --security-group-ids $ANTHOS_SG_ID --subnet-id $ANTHOS_SUBNET_ID --associate-public-ip-address --block-device-mappings 'DeviceName=/dev/sda1,Ebs={VolumeSize=200}' </code></pre></div></div> <p>The instance is now created.</p>
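<p>Before proceeding, you may want to confirm that the instance is up; a quick check would be something like this:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws ec2 describe-instances --filters Name=subnet-id,Values=$ANTHOS_SUBNET_ID --query "Reservations[0].Instances[0].State.Name" --output text running </code></pre></div></div>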
<p>We can give it a name.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ANTHOS_INSTANCE_ID=$(aws ec2 describe-instances | jq -r '.Reservations[] | .Instances[] | select(.SubnetId==env.ANTHOS_SUBNET_ID) | .InstanceId') $ aws ec2 create-tags --resources $ANTHOS_INSTANCE_ID --tags Key=Name,Value=anthos-node </code></pre></div></div> <h2 id="ssh">SSH</h2> <p>In order to SSH from the cloud shell to the Anthos instance, we first need to obtain the public IP of the cloud shell and add a rule in the security group to allow SSH access from that IP.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ export MY_PUBLIC_IP=$(curl ifconfig.me --silent) $ aws ec2 authorize-security-group-ingress --group-id $ANTHOS_SG_ID --protocol tcp --port 22 --cidr $MY_PUBLIC_IP/32 </code></pre></div></div> <p>Copy the SSH key pair to the instance as it’s needed in the Anthos cluster config.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ export ANTHOS_INSTANCE_IP=$(aws ec2 describe-instances --filter Name=tag:Name,Values=anthos-node --query "Reservations[*].Instances[*].PublicIpAddress" --output text) $ scp -i ~/.ssh/anthosKeyPair.pem ~/.ssh/anthosKeyPair.pem ubuntu@$ANTHOS_INSTANCE_IP:~/.ssh/anthosKeyPair.pem </code></pre></div></div> <p>SSH into the instance.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ssh -i ~/.ssh/anthosKeyPair.pem ubuntu@$ANTHOS_INSTANCE_IP </code></pre></div></div> <h2 id="install-gcloud">Install gcloud</h2> <p>Install the gcloud cli on the Anthos instance.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install apt-transport-https ca-certificates gnupg -y echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - sudo apt-get update -y &amp;&amp; sudo apt-get install google-cloud-cli -y </code></pre></div></div> <h2 id="login--to-gcloud">Login to gcloud</h2> <p>This step is optional. 
Login to the gcloud cli with your account if you are going to create a service account yourself from the CLI.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gcloud auth login </code></pre></div></div> <h2 id="authentication">Authentication</h2> <p>Create a <a href="https://cloud.google.com/iam/docs/creating-managing-service-accounts">service account</a> in GCP and give it the following roles.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>roles/gkehub.connect roles/gkehub.admin roles/logging.logWriter roles/monitoring.metricWriter roles/monitoring.dashboardEditor roles/stackdriver.resourceMetadata.writer roles/opsconfigmonitoring.resourceMetadata.writer </code></pre></div></div> <p>If you are using the gcloud CLI, you can create a service account and bind the roles with the following command.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gcloud iam service-accounts create &lt;service-account-name&gt; gcloud projects add-iam-policy-binding "$PROJECT_ID" \ --member=&lt;service-account-client-email&gt; \ --role=&lt;role&gt; \ --no-user-output-enabled </code></pre></div></div>
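<p>For instance, with a hypothetical service account named anthos-sa, creating it and binding all of the above roles could be scripted roughly like this:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gcloud iam service-accounts create anthos-sa for role in roles/gkehub.connect roles/gkehub.admin roles/logging.logWriter roles/monitoring.metricWriter roles/monitoring.dashboardEditor roles/stackdriver.resourceMetadata.writer roles/opsconfigmonitoring.resourceMetadata.writer; do gcloud projects add-iam-policy-binding "$PROJECT_ID" --member="serviceAccount:anthos-sa@${PROJECT_ID}.iam.gserviceaccount.com" --role="$role" --no-user-output-enabled; done </code></pre></div></div>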
<p>Create a key for the service account and copy its credentials.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gcloud iam service-accounts keys create &lt;key-file-path&gt; \ --iam-account=${service-account-name}@${PROJECT_ID}.iam.gserviceaccount.com </code></pre></div></div> <p>Setup the credentials in the instance, activate the service account and set the project id.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mkdir .gcloud $ export PROJECT_ID=&lt;project_id&gt; $ cat &gt; .gcloud/keyfile.json &lt;&lt; EOF { "type": "service_account", "project_id": "$PROJECT_ID", "private_key_id": "&lt;private_key_id&gt;", "private_key": &lt;private_key&gt;, "client_email": &lt;client_email&gt;, "client_id": &lt;client_id&gt;, "auth_uri": "https://accounts.google.com/o/oauth2/auth", "token_uri": "https://oauth2.googleapis.com/token", "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", "client_x509_cert_url": &lt;client_x509_cert_url&gt; } EOF $ gcloud auth activate-service-account &lt;client_email&gt; --key-file .gcloud/keyfile.json $ export GOOGLE_APPLICATION_CREDENTIALS='/home/ubuntu/.gcloud/keyfile.json' $ gcloud config set project $PROJECT_ID </code></pre></div></div> <h2 id="apis">APIs</h2> <p>Enable the services.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gcloud services enable \ anthos.googleapis.com \ anthosaudit.googleapis.com \ anthosgke.googleapis.com \ cloudresourcemanager.googleapis.com \ container.googleapis.com \ gkeconnect.googleapis.com \ gkehub.googleapis.com \ serviceusage.googleapis.com \ stackdriver.googleapis.com \ monitoring.googleapis.com \ logging.googleapis.com \ opsconfigmonitoring.googleapis.com </code></pre></div></div> <h2 id="other-tools">Other tools</h2> <p>Install kubectl, bmctl and docker.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -LO "https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl" chmod +x kubectl sudo mv kubectl /usr/local/sbin/ gsutil cp gs://anthos-baremetal-release/bmctl/1.13.0/linux-amd64/bmctl . chmod a+x bmctl sudo mv bmctl /usr/local/sbin/ curl -O https://download.docker.com/linux/static/stable/x86_64/docker-20.10.9.tgz tar xzvf docker-20.10.9.tgz sudo cp docker/* /usr/bin/ rm -rf docker* sudo groupadd docker sudo usermod -aG docker $USER sudo dockerd&amp; newgrp docker </code></pre></div></div> <h2 id="vxlan">VxLAN</h2> <p>Set up a vxlan interface with the IP 10.200.0.2/24.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo ip link add vxlan0 type vxlan id 42 dev eth0 dstport 0 sudo ip addr add 10.200.0.2/24 dev vxlan0 sudo ip link set up dev vxlan0 </code></pre></div></div> <h2 id="cluster-config">Cluster config</h2> <p>Create the Anthos cluster config with bmctl.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>export CLUSTER_ID=anthos-aws bmctl create config -c $CLUSTER_ID </code></pre></div></div> <p>Change the config.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &gt; bmctl-workspace/${CLUSTER_ID}/${CLUSTER_ID}.yaml &lt;&lt; EOF --- gcrKeyPath: /home/ubuntu/.gcloud/keyfile.json sshPrivateKeyPath: /home/ubuntu/.ssh/anthosKeyPair.pem gkeConnectAgentServiceAccountKeyPath: /home/ubuntu/.gcloud/keyfile.json gkeConnectRegisterServiceAccountKeyPath: /home/ubuntu/.gcloud/keyfile.json cloudOperationsServiceAccountKeyPath: /home/ubuntu/.gcloud/keyfile.json --- apiVersion: v1 kind: Namespace metadata: name: cluster-${CLUSTER_ID} --- apiVersion: baremetal.cluster.gke.io/v1 kind: Cluster metadata: name: ${CLUSTER_ID} namespace: cluster-${CLUSTER_ID} spec: profile: edge type: standalone anthosBareMetalVersion: 1.13.0 gkeConnect: projectID: $PROJECT_ID controlPlane: nodePoolSpec: clusterName: ${CLUSTER_ID} nodes: - address: 10.200.0.2 clusterNetwork: pods: cidrBlocks: - 192.168.0.0/16 services: cidrBlocks: - 172.26.232.0/24 loadBalancer: mode: bundled ports: controlPlaneLBPort: 443 vips: controlPlaneVIP: 10.200.0.49 ingressVIP: 10.200.0.50 addressPools: - name: pool1 addresses: - 10.200.0.50-10.200.0.70 clusterOperations: location: asia-south1 projectID: $PROJECT_ID storage: lvpNodeMounts: path: /mnt/localpv-disk storageClassName: node-disk lvpShare: numPVUnderSharedPath: 5 path: /mnt/localpv-share storageClassName: local-shared nodeConfig: podDensity: maxPodsPerNode: 64 nodeAccess: loginUser: ubuntu EOF </code></pre></div></div> <h2 id="cluster-creation">Cluster creation</h2> <p>Create the cluster.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ bmctl create cluster -c ${CLUSTER_ID} </code></pre></div></div> <p>The above command should take some time, and once it’s successful, the kubernetes cluster should be ready.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ export KUBECONFIG=bmctl-workspace/${CLUSTER_ID}/${CLUSTER_ID}-kubeconfig $ kubectl get nodes NAME STATUS ROLES AGE VERSION ip-10-0-0-9 Ready control-plane,master 153m v1.24.2-gke.1900 </code></pre></div></div> <p>The cluster should appear on the Anthos clusters page on GCP.</p> <h2 id="run-workloads">Run workloads</h2> <p>Test with a sample nginx deployment.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &gt; deploy.yaml &lt;&lt; EOF apiVersion: apps/v1 kind: Deployment metadata: name: nginx spec: template: metadata: 
labels: app: nginx spec: containers: - name: nginx image: nginx selector: matchLabels: app: nginx EOF $ kubectl create -f deploy.yaml $ kubectl get deploy NAME READY UP-TO-DATE AVAILABLE AGE nginx 1/1 1 1 65s $ kubectl get pod NAME READY STATUS RESTARTS AGE nginx-8f458dc5b-jvrbb 1/1 Running 0 68s </code></pre></div></div> <p>Expose this deployment with a service.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat &gt; svc.yaml &lt;&lt;EOF apiVersion: v1 kind: Service metadata: name: nginx spec: selector: app: nginx ports: - port: 8080 targetPort: 80 name: web-server EOF $ kubectl create -f svc.yaml $ kubectl get svc nginx NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE nginx ClusterIP 172.26.232.21 &lt;none&gt; 8080/TCP 106s $ kubectl get ep nginx NAME ENDPOINTS AGE nginx 192.168.0.48:80 2m3s </code></pre></div></div> <p>Try to curl the service IP and see if it works.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ curl 172.26.232.21:8080 &lt;!DOCTYPE html&gt; &lt;html&gt; &lt;head&gt; &lt;title&gt;Welcome to nginx!&lt;/title&gt; &lt;style&gt; html { color-scheme: light dark; } body { width: 35em; margin: 0 auto; font-family: Tahoma, Verdana, Arial, sans-serif; } &lt;/style&gt; &lt;/head&gt; &lt;body&gt; &lt;h1&gt;Welcome to nginx!&lt;/h1&gt; &lt;p&gt;If you see this page, the nginx web server is successfully installed and working. Further configuration is required.&lt;/p&gt; &lt;p&gt;For online documentation and support please refer to &lt;a href="http://nginx.org/"&gt;nginx.org&lt;/a&gt;.&lt;br/&gt; Commercial support is available at &lt;a href="http://nginx.com/"&gt;nginx.com&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;&lt;em&gt;Thank you for using nginx.&lt;/em&gt;&lt;/p&gt; &lt;/body&gt; &lt;/html&gt; </code></pre></div></div> <h2 id="reset-the-cluster">Reset the cluster</h2> <p>Finally, reset the cluster when you no longer need it.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bmctl reset cluster -c ${CLUSTER_ID} </code></pre></div></div> <p>Thus, we have launched a single node kubernetes cluster with Anthos on AWS, and tested it by running an nginx service.</p> <p>Thanks for reading!!!</p>This post first appeared on dev.toAWS CloudTrail log file validation2022-08-29T00:00:00+00:002022-08-29T00:00:00+00:00/aws,/cloudtrail,/logging,/s3/2022/08/29/aws-cloudtrail-log-file-validation<p>This post first appeared on <a href="https://dev.to/aws-builders/aws-cloudtrail-log-file-validation-1ehl">dev.to</a></p> <h2 id="introduction">Introduction</h2> <p>CloudTrail lets us log all API calls in our AWS cloud. In this post, we shall see how to create a trail, see where the logs are stored in S3, delete log and digest files, and perform log file validation.</p> <h2 id="create-a-cloudtrail">Create a CloudTrail</h2> <p>Search for CloudTrail on the AWS console and create a trail. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5ojcn1fxyhmdz2bvzvzi.png" alt="Create cloud trail" /></p> <h2 id="s3-bucket">S3 bucket</h2> <p>A bucket should be automatically created and associated with the CloudTrail. A folder with the name CloudTrail should appear in the bucket, where all the cloud trail logs should get saved. 
<img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yw5zei5v3ouuie7f0mbn.png" alt="S3 bucket for cloud trail" /></p> <h2 id="generate-logs">Generate logs</h2> <p>Now let’s do an activity and see if it gets logged. Create a lambda function named helloWorld with all other settings as default. You can do any other activity on AWS cloud as well, instead of creating a function. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ntkzldnsrkx2xenqdsb9.png" alt="Lambda function" /></p> <p>We should see some files on S3 for this activity. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/x5oayy9diyj1w6kui5kf.png" alt="Log files on S3" /></p> <h2 id="delete-log-file">Delete log file</h2> <p>I am deleting one of the log files. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zle3jv7vq8m0eedykz19.png" alt="Deleting log file" /></p> <p>Log files are not supposed to be modified/deleted, as they can hold important auditing information, so now we need to find out if our log files are modified or deleted (as in this case).</p> <p>We try to validate now from the <a href="https://aws.amazon.com/cli/">AWS CLI</a>; it should say the digest file doesn’t exist.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws cloudtrail validate-logs --trail-arn arn:aws:cloudtrail:ap-south-1:&lt;account-id&gt;:trail/management-events --start-time 2022-08-29 Validating log files for trail arn:aws:cloudtrail:ap-south-1:&lt;account-id&gt;:trail/management-events between 2022-08-29T00:00:00Z and 2022-08-29T06:26:38Z Results requested for 2022-08-29T00:00:00Z to 2022-08-29T06:26:38Z No digests found </code></pre></div></div> <p>This is because we have not enabled log file validation for the cloud trail.</p> <h2 id="enable-log-file-validation">Enable Log file validation</h2> <p>We can enable log file validation by editing the cloud trail. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1s7p6e0r18dpgjs11as3.png" alt="Enable log file validation" /></p>
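<p>Alternatively, the same can be done from the AWS CLI, along these lines:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws cloudtrail update-trail --name management-events --enable-log-file-validation </code></pre></div></div>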
<h2 id="digest">Digest</h2> <p>As the log file validation is enabled, we should see a new folder CloudTrail-Digest in S3. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ixq3invj5kug2y1m5hfj.png" alt="CloudTrail-Digest folder in S3" /></p> <p>And digest files should get added each hour. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/36mc4wdwlsucq7a12n26.png" alt="Digest file in S3" /></p> <h2 id="validate">Validate</h2> <p>As we enabled log file validation, we can now check the integrity of the logs.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws cloudtrail validate-logs --trail-arn arn:aws:cloudtrail:ap-south-1:&lt;account-id&gt;:trail/management-events --start-time 2022-08-29 Validating log files for trail arn:aws:cloudtrail:ap-south-1:&lt;account-id&gt;:trail/management-events between 2022-08-29T00:00:00Z and 2022-08-29T07:00:20Z Results requested for 2022-08-29T00:00:00Z to 2022-08-29T07:00:20Z Results found for 2022-08-29T05:55:08Z to 2022-08-29T06:55:08Z: 1/1 digest files valid </code></pre></div></div> <p>Though we deleted a log file earlier, it shows the digest is valid, because we did not enable log file validation then.</p> <p>After an hour, we should see two digest files. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vsot6bvy7kzsq632poii.png" alt="Digest files" /></p> <p>The log file validation seems good for now.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws cloudtrail validate-logs --trail-arn arn:aws:cloudtrail:ap-south-1:&lt;account-id&gt;:trail/management-events --start-time 2022-08-29 Validating log files for trail arn:aws:cloudtrail:ap-south-1:&lt;account-id&gt;:trail/management-events between 2022-08-29T00:00:00Z and 2022-08-29T08:17:57Z Results requested for 2022-08-29T00:00:00Z to 2022-08-29T08:17:57Z Results found for 2022-08-29T05:55:08Z to 2022-08-29T07:55:08Z: 2/2 digest files valid 10/10 log files valid </code></pre></div></div> <h2 id="delete-log-file-with-validation-check">Delete log file with validation check</h2> <p>We can try deleting a log file that was created after enabling log file validation. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rmkawn7l1s5hw91o8vmb.png" alt="Delete another log file" /></p> <p>As expected, the log file validation fails for one file. However the digests are still valid.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws cloudtrail validate-logs --trail-arn arn:aws:cloudtrail:ap-south-1:&lt;account-id&gt;:trail/management-events --start-time 2022-08-29 Validating log files for trail arn:aws:cloudtrail:ap-south-1:&lt;account-id&gt;:trail/management-events between 2022-08-29T00:00:00Z and 2022-08-29T08:22:42Z Log file s3://aws-cloudtrail-logs-&lt;account-id&gt;-4a8dcb98/AWSLogs/&lt;account-id&gt;/CloudTrail/ap-south-1/2022/08/29/&lt;account-id&gt;_CloudTrail_ap-south-1_20220829T0755Z_7rDSVFC6Icgi9Z8V.json.gz INVALID: not found Results requested for 2022-08-29T00:00:00Z to 2022-08-29T08:22:42Z Results found for 2022-08-29T05:55:08Z to 2022-08-29T07:55:08Z: 2/2 digest files valid 9/10 log files valid, 1/10 log files INVALID </code></pre></div></div> <p>It also clearly says the validation failed because it can’t find a file that we deleted.</p> <h2 id="delete-digest">Delete digest</h2> <p>This time we can try deleting a digest file. 
<img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vzzw970xszdwllgpqqdp.png" alt="Delete digest" /></p> <p>Hence the digest validation should also fail.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws cloudtrail validate-logs --trail-arn arn:aws:cloudtrail:ap-south-1:&lt;account-id&gt;:trail/management-events --start-time 2022-08-29 Validating log files for trail arn:aws:cloudtrail:ap-south-1:&lt;account-id&gt;:trail/management-events between 2022-08-29T00:00:00Z and 2022-08-29T10:09:35Z Digest file s3://aws-cloudtrail-logs-&lt;account-id&gt;-4a8dcb98/AWSLogs/&lt;account-id&gt;/CloudTrail-Digest/ap-south-1/2022/08/29/&lt;account-id&gt;_CloudTrail-Digest_ap-south-1_management-events_ap-south-1_20220829T085508Z.json.gz INVALID: not found Log file s3://aws-cloudtrail-logs-&lt;account-id&gt;-4a8dcb98/AWSLogs/&lt;account-id&gt;/CloudTrail/ap-south-1/2022/08/29/&lt;account-id&gt;_CloudTrail_ap-south-1_20220829T0755Z_7rDSVFC6Icgi9Z8V.json.gz INVALID: not found Results requested for 2022-08-29T00:00:00Z to 2022-08-29T10:09:35Z Results found for 2022-08-29T05:55:08Z to 2022-08-29T09:55:08Z: 3/4 digest files valid, 1/4 digest files INVALID 20/21 log files valid, 1/21 log files INVALID </code></pre></div></div> <p>Note that we can enable versioning on S3 buckets, so that such deleted files can be restored.</p>
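<p>For instance, versioning on our trail’s bucket could be turned on with a single call:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ aws s3api put-bucket-versioning --bucket aws-cloudtrail-logs-&lt;account-id&gt;-4a8dcb98 --versioning-configuration Status=Enabled </code></pre></div></div>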
<h2 id="summary">Summary</h2> <p>So we saw how the log file validation feature in CloudTrail helps us find if there were any manual modifications to the log files or digest files. Thank you for reading !!!</p>This post first appeared on dev.toAWS IoT pub/sub over MQTT2022-07-17T00:00:00+00:002022-07-17T00:00:00+00:00/aws,/ios,/python,/mqtt/2022/07/17/aws-iot-pub-sub-over-mqtt<p><em>This post first appeared on <a href="https://dev.to/aws-builders/aws-iot-pubsub-over-mqtt-1oig">dev.to</a></em></p> <h2 id="introduction">Introduction</h2> <p>Hello, in this post we would create an IoT thing on AWS, use its credentials to create two virtual clients on a Linux VM with python, and test publishing from one client and subscribing from the other.</p> <h2 id="vm">VM</h2> <p>Use your Linux machine or a VM as a virtual IoT device. We would be doing all of the CLI / coding tasks in the post on this VM.</p> <h2 id="aws">AWS</h2> <p>Install and setup the <a href="https://aws.amazon.com/cli/">AWS CLI</a>. Here is the region I have set as default.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat ~/.aws/config [default] region = ap-south-1 </code></pre></div></div> <h2 id="endpoint">Endpoint</h2> <p>Goto <code class="language-plaintext highlighter-rouge">AWS IoT &gt; Settings</code> on the cloud console, and get the Device data endpoint, which is unique to the AWS account/region. Or get it from the AWS CLI.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ IOT_DEV_EP=$(aws iot describe-endpoint --region ap-south-1 --output text --query endpointAddress) $ echo $IOT_DEV_EP &lt;some-id&gt;.iot.ap-south-1.amazonaws.com </code></pre></div></div> <p>Check connectivity to this endpoint from the Linux VM, which is your virtual IoT device.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ping -c 1 $IOT_DEV_EP ---TRUNCATED--- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 196.145/196.145/196.145/0.000 ms </code></pre></div></div> <p>I have tested with 1 packet <code class="language-plaintext highlighter-rouge">-c 1</code>. You may send more than one though.</p> <p>You can also check connectivity to the secure port for MQTT, i.e. 8883, on the endpoint. Telnet should be present/installed on the machine though, for ex. <code class="language-plaintext highlighter-rouge">sudo yum install telnet -y</code>.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ telnet $IOT_DEV_EP 8883 Trying &lt;some-ip&gt;... Connected to &lt;some-id&gt;-ats.iot.ap-south-1.amazonaws.com. Escape character is '^]'. </code></pre></div></div> <h2 id="thing">Thing</h2> <p>Goto <code class="language-plaintext highlighter-rouge">AWS IoT &gt; Manage &gt; Things &gt; Create Things</code> on the cloud console and create a new thing with the name <em>temp-sensor</em>, set unnamed shadow(classic) and choose <em>Auto-generate a new certificate (recommended)</em>.</p> <p>In the policies section, create and select a new policy with the name <em>temp-sensor</em> and the following JSON.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "iot:Connect", "iot:Publish", "iot:Receive", "iot:RetainPublish", "iot:Subscribe" ], "Resource": "*" } ] } </code></pre></div></div> <p>Download all the certificates/keys and name them as needed; I have named them as follows.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ls *.pem ca-cert.pem pub-cert.pem pub-key.pem pvt-key.pem </code></pre></div></div> <p>Note: If you are using a different host system like Windows with a browser, you can download these files, copy the content and then paste it into the respective files on the Linux VM.</p> <h2 id="sdk">SDK</h2> <p>We would be using the AWS IoT SDK for Python.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Clone the repository git clone https://github.com/aws/aws-iot-device-sdk-python-v2.git # Install using Pip python3 -m pip install ./aws-iot-device-sdk-python-v2 # Remove the clone, if it isn't required anymore $ rm -rf aws-iot-device-sdk-python-v2 </code></pre></div></div> <h2 id="connect">Connect</h2> <p>We have to first import the mqtt_connection_builder package from the awsiot sdk.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from awsiot import mqtt_connection_builder </code></pre></div></div> <p>We need the endpoint, the certificate/key paths and a client_id to initiate a connection. 
We can generate a client_id using the uuid package.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from uuid import uuid4 client_id = 'client-' + str(uuid4()) </code></pre></div></div> <p>We can then pass the files as arguments using the argparse package.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>##### parse arguments import argparse parser = argparse.ArgumentParser(description="Send and receive messages through an MQTT connection.") parser.add_argument('--ep', help="IoT device endpoint &lt;some-prefix&gt;.iot.&lt;region&gt;.amazonaws.com", required=True, type=str) parser.add_argument('--pubcert', help="IoT device public certificate file path", required=True, type=str) parser.add_argument('--pvtkey', help="IoT device private key file path", required=True, type=str) parser.add_argument('--cacert', help="IoT device CA cert file path", required=True, type=str) parser.add_argument('--topic', help="Topic name", required=True, type=str) args = parser.parse_args() </code></pre></div></div> <p>We now have the necessary parameters to initiate the connection.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mqtt_connection = mqtt_connection_builder.mtls_from_path( endpoint=args.ep, cert_filepath=args.pubcert, pri_key_filepath=args.pvtkey, ca_filepath=args.cacert, client_id=client_id ) connect_future = mqtt_connection.connect() # result() waits until a result is available connect_future.result() print(f'{client_id} is connected!') </code></pre></div></div> <p>Put the code we saw in the connect section so far in a file called connect.py and run the following.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> $ python connect.py --ep $IOT_DEV_EP --pubcert pub-cert.pem --pvtkey pvt-key.pem --cacert ca-cert.pem --topic temperature client-3924e5d4-97d3-43e6-b214-169d008b2d02 is connected! </code></pre></div></div> <p>Great, the connection is successful.</p> <p>You can also skip the argparse step and set the parameters directly in the code.</p>
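<p>A minimal sketch of that, with placeholder values (replace the endpoint with your own):</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># a sketch that skips argparse; the values below are placeholders from types import SimpleNamespace args = SimpleNamespace( ep='&lt;some-id&gt;-ats.iot.ap-south-1.amazonaws.com', pubcert='pub-cert.pem', pvtkey='pvt-key.pem', cacert='ca-cert.pem', topic='temperature' ) </code></pre></div></div>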
<h2 id="publish">Publish</h2> <p>Before publishing, let’s import certain variables from the previous connect code we wrote.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># import vars from connect.py from connect import args, client_id, mqtt_connection </code></pre></div></div> <p>We shall publish a message from our client that contains the client-id, temperature and current time. We already have the client_id with us.</p> <p>We can use the datetime library for getting the timestamp.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># set timestamp from datetime import datetime now = datetime.now() </code></pre></div></div> <p>And we can generate a random number for the temperature.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># set temperature import random temp = random.randrange(10, 40) </code></pre></div></div> <p>So our message now looks like:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># form the message message = f'id: {client_id}, temp: {temp}, time: {now}' </code></pre></div></div> <p>Time to publish it with the publish method.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># publish the message from awscrt import mqtt import json mqtt_connection.publish( topic=args.topic, payload= json.dumps(message), qos=mqtt.QoS.AT_LEAST_ONCE ) print('Message published') </code></pre></div></div> <p>Note that awscrt is the AWS common runtime library we are using to set the <a href="http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc385349263">QoS</a>.</p> <p>Put this code in a separate file named <em>publisher.py</em> and run it.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> $ python publisher.py --ep $IOT_DEV_EP --pubcert pub-cert.pem --pvtkey pvt-key.pem --cacert ca-cert.pem --topic temperature client-cb3f69b6-b53b-42a4-973f-63abe39f2c4f is connected! Message published </code></pre></div></div> <p>So far we have published only one message; I will now modify the code so that it continuously sends one message per second until interrupted with Ctrl C.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat publisher.py # import vars from connect.py from connect import args, client_id, mqtt_connection from awscrt import mqtt from datetime import datetime import json, random, time while True: # set timestamp now = datetime.now() # set temperature temp = random.randrange(10, 40) # form the message message = f'id: {client_id}, temp: {temp}, time: {now}' # publish the message mqtt_connection.publish( topic=args.topic, payload= json.dumps(message), qos=mqtt.QoS.AT_LEAST_ONCE ) print(f'Message published: {message}') time.sleep(1) </code></pre></div></div> <p>Run the code again.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> $ python publisher.py --ep $IOT_DEV_EP --pubcert pub-cert.pem --pvtkey pvt-key.pem --cacert ca-cert.pem --topic temperature client-1102832d-a0c0-481c-b1f4-5b363f9c0890 is connected! 
Message published: id: client-1102832d-a0c0-481c-b1f4-5b363f9c0890, temp: 14, time: 2022-07-17 09:20:44.652955 Message published: id: client-1102832d-a0c0-481c-b1f4-5b363f9c0890, temp: 29, time: 2022-07-17 09:20:45.654102 Message published: id: client-1102832d-a0c0-481c-b1f4-5b363f9c0890, temp: 35, time: 2022-07-17 09:20:46.655002 </code></pre></div></div> <p>Publishing looks good, let’s go to the subscriber.</p> <h2 id="subscriber">Subscriber</h2> <p>First, import certain vars from the connect module, similar to what we did in the publisher.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># import vars from connect.py from connect import args, mqtt_connection </code></pre></div></div> <p>Define a callback function that triggers when a message is received on the topic.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># call back to trigger when a message is received def on_message_received(topic, payload, dup, qos, retain, **kwargs): print("Received message from topic '{}': {}".format(topic, payload)) </code></pre></div></div> <p>Subscribe to the topic.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>##### subscribe to topic from awscrt import mqtt subscribe_future, packet_id = mqtt_connection.subscribe( topic=args.topic, qos=mqtt.QoS.AT_LEAST_ONCE, callback=on_message_received ) # result() waits until a result is available subscribe_result = subscribe_future.result() print(f'Subscribed to {args.topic}') </code></pre></div></div> <p>We need to keep the program open, so that we can read the messages, as defined in the callback function. For this, we can use the threading module.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import threading threading.Event().wait() </code></pre></div></div> <p>Keep this code in a file named subscriber.py.</p> <p>Time to run the subscriber code while the publisher code is also running. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/pymtve3ooe2yg0qilgb1.png" alt="Pub/Sub on clients" /></p> <h2 id="test-on-console">Test on console</h2> <p>You can also test if the publish/subscribe operations are working correctly via the handy MQTT test client on AWS cloud. So if you are publishing from the code, you can test it at the subscriber window. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/baw6ios95lo1he3naur7.png" alt="Sub on MQTT test client" /></p> <p>And likewise if you are subscribing on the code, you can publish a test message from the MQTT test client. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wlccrt9k3jocpzwcmmjb.png" alt="Pub on MQTT test client" /></p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ python3 subscriber.py --ep $IOT_DEV_EP --pubcert pub-cert.pem --pvtkey pvt-key.pem --cacert ca-cert.pem --topic temperature client-a17093b1-108e-4f3c-a65c-ea38900f2153 is connected! Subscribed to temperature Received message from topic 'temperature': b'{\n "message": "Hello from AWS IoT console"\n}' </code></pre></div></div>
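<p>One optional refinement: instead of the scripts waiting forever, you may want to close the connection gracefully before exiting; a minimal sketch of that:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># a sketch: disconnect cleanly when done disconnect_future = mqtt_connection.disconnect() # result() waits until the disconnect completes disconnect_future.result() print('Disconnected!') </code></pre></div></div>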
<p>Subscribe to the topic.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># subscribe to the topic
from awscrt import mqtt

subscribe_future, packet_id = mqtt_connection.subscribe(
    topic=args.topic,
    qos=mqtt.QoS.AT_LEAST_ONCE,
    callback=on_message_received
)

# result() waits until a result is available
subscribe_result = subscribe_future.result()
print(f'Subscribed to {args.topic}')
</code></pre></div></div> <p>We need to keep the program open, so that we can read the messages via the callback function. For this, we can use the threading module.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import threading
threading.Event().wait()
</code></pre></div></div> <p>Keep this code in a file named subscriber.py.</p>
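<p>For reference, here are those snippets assembled into the complete <em>subscriber.py</em>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># import vars from connect.py
from connect import args, mqtt_connection
from awscrt import mqtt
import threading

# callback to trigger when a message is received
def on_message_received(topic, payload, dup, qos, retain, **kwargs):
    print("Received message from topic '{}': {}".format(topic, payload))

# subscribe to the topic
subscribe_future, packet_id = mqtt_connection.subscribe(
    topic=args.topic,
    qos=mqtt.QoS.AT_LEAST_ONCE,
    callback=on_message_received
)

# result() waits until a result is available
subscribe_result = subscribe_future.result()
print(f'Subscribed to {args.topic}')

# keep the program alive so the callback can keep receiving messages
threading.Event().wait()
</code></pre></div></div>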
= "expires_after_2022_07_31" description = "Expiring at midnight of 2022-07-31" expression = "request.time &lt; timestamp(\"2022-08-01T00:00:00Z\")" } } resource "google_project_iam_member" "pub_sub_editor" { project = var.project_id role = "roles/pubsub.editor" member = "serviceAccount:${google_service_account.iot_sa.email}" condition { title = "expires_after_2022_07_31" description = "Expiring at midnight of 2022-07-31" expression = "request.time &lt; timestamp(\"2022-08-01T00:00:00Z\")" } } $ cat ~/sa/variables.tf variable "project_id" { type = string default = "&lt;project-id&gt;" } $ cat ~/sa/outputs.tf output "iot_sa_private_key" { description = "Private key of the IoT service account" value = google_service_account_key.iot_sa_key.private_key sensitive = true } </code></pre></div></div> <p>So we are creating a service account with editor roles on IoT core &amp; Pub/Sub, a key for the service account with rotation, and then we would output the private key to save it locally for future use.</p> <h2 id="api">API</h2> <p>We have to enable the Cloud IoT <a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/google_project_service">API</a>, you can get the fqdn of it using <code class="language-plaintext highlighter-rouge">$ gcloud services list --available --filter="name~.*iot.*"</code>. Let’s add the terraform configuration which can enable it.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat ~/sa/variables.tf variable "project_id" { type = string default = "&lt;project-id&gt;" } $ cat ~/sa/apis.tf resource "google_project_service" "cloudiot" { project = var.project_id service = "cloudiot.googleapis.com" timeouts { create = "30m" update = "40m" } disable_dependent_services = true } </code></pre></div></div> <h2 id="apply">Apply</h2> <p>We can now create the service account, it’s associated resources, and enable the Cloud IoT API.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd ~/sa $ terraform init # optional, to know what will be changed $ terraform plan $ terraform apply --auto-approve </code></pre></div></div> <h2 id="validate">Validate</h2> <p>Validate the service account creation, via the console. <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/osu1meit8btydayjxdx9.png" alt="Service account" /></p> <p>And the roles attached to it. 
<img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yg9mlgi34dhr1jgbe229.png" alt="Service account roles" /></p> <h2 id="key">Key</h2> <p>The private key of the service account could be retrieved from the terraform output.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ terraform output -raw iot_sa_private_key | base64 -d &gt; ~/.auth/iot_sa_private_key.json </code></pre></div></div> <p>We have saved the base64 decoded private key in a hidden auth directory at home.</p> <h2 id="credentials">Credentials</h2> <p>We could now start using the service principal’s private key as a credential for rest of our Terraform activities, for which we have to set an environment variable.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ export GOOGLE_APPLICATION_CREDENTIALS=~/.auth/iot_sa_private_key.json </code></pre></div></div> <p>Note: to remove the credential anytime, jus run <code class="language-plaintext highlighter-rouge">unset GOOGLE_APPLICATION_CREDENTIALS</code></p> <h2 id="certificate">Certificate</h2> <p>The connection between the IoT devices and Google IoT core would be secure over TLS, hence a <a href="https://cloud.google.com/iot/docs/create-device-registry#create_your_credentials">certificate</a> should be generated for our virtual device.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ openssl req -x509 -newkey rsa:2048 -keyout ~/.auth/rsa_private.pem -nodes -out ~/.auth/rsa_cert.pem -subj "/CN=unused" $ ls ~/.auth/ | grep pem rsa_cert.pem rsa_private.pem </code></pre></div></div> <p>The private key is in rsa_private.pem and the public certificate is in rsa_cert.pem.</p> <p>We would keep the private key locally and refer to it while generating a client connection from our device(we are not dealing with the client side of things in this post though), where as the public certificate would be attached to the remote side, in this case, the IoT core.</p> <h2 id="registry">Registry</h2> <p>Add the <a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/cloudiot_registry">terrafaorm</a> configuration for the device registry, pub/sub topics it would use.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd ~/iot $ cat registry.tf # https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/cloudiot_registry resource "google_cloudiot_registry" "iot-registry" { name = "iot-registry" event_notification_configs { pubsub_topic_name = google_pubsub_topic.additional-telemetry.id subfolder_matches = "test/path" } event_notification_configs { pubsub_topic_name = google_pubsub_topic.default-telemetry.id subfolder_matches = "" } state_notification_config = { pubsub_topic_name = google_pubsub_topic.default-devicestatus.id } mqtt_config = { mqtt_enabled_state = "MQTT_ENABLED" } http_config = { http_enabled_state = "HTTP_ENABLED" } log_level = "INFO" } </code></pre></div></div> <p>We would be using 3 topics, all messages published by the client to the path /devices/DEVICE_ID/events would go to the default telemetry topic, and all messages for /devices/DEVICE_ID/state would go to the default device state topic. 
<h2 id="registry">Registry</h2> <p>Add the <a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/cloudiot_registry">terraform</a> configuration for the device registry and the pub/sub topics it would use.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd ~/iot
$ cat registry.tf
# https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/cloudiot_registry
resource "google_cloudiot_registry" "iot-registry" {
  name = "iot-registry"

  event_notification_configs {
    pubsub_topic_name = google_pubsub_topic.additional-telemetry.id
    subfolder_matches = "test/path"
  }

  event_notification_configs {
    pubsub_topic_name = google_pubsub_topic.default-telemetry.id
    subfolder_matches = ""
  }

  state_notification_config = {
    pubsub_topic_name = google_pubsub_topic.default-devicestatus.id
  }

  mqtt_config = {
    mqtt_enabled_state = "MQTT_ENABLED"
  }

  http_config = {
    http_enabled_state = "HTTP_ENABLED"
  }

  log_level = "INFO"
}
</code></pre></div></div> <p>We would be using 3 topics: all messages published by the client to the path /devices/DEVICE_ID/events would go to the default telemetry topic, and all messages for /devices/DEVICE_ID/state would go to the default device state topic. We have one additional topic with the subfolder path “test/path”, which means messages published to /devices/DEVICE_ID/events/test/path would land there.</p> <h2 id="pubsub">Pub/Sub</h2> <p>A separate file for creating the pub/sub topics which will be linked to the registry.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>resource "google_pubsub_topic" "default-devicestatus" {
  name = "default-devicestatus"
}

resource "google_pubsub_topic" "default-telemetry" {
  name = "default-telemetry"
}

resource "google_pubsub_topic" "additional-telemetry" {
  name = "additional-telemetry"
}
</code></pre></div></div>
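<p>Once telemetry starts flowing into these topics, you would read it through a Pub/Sub subscription. We don't create one in this post, so treat the following as a sketch: it assumes a subscription named <em>telemetry-sub</em> exists on the default-telemetry topic and that <code class="language-plaintext highlighter-rouge">google-cloud-pubsub</code> is installed.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># sketch: pull device telemetry from a hypothetical "telemetry-sub" subscription
from concurrent.futures import TimeoutError
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path('&lt;project-id&gt;', 'telemetry-sub')

# print and acknowledge each telemetry message
def callback(message):
    print(f'Received: {message.data}')
    message.ack()

streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
with subscriber:
    try:
        streaming_pull.result(timeout=30)  # listen for 30 seconds
    except TimeoutError:
        streaming_pull.cancel()
        streaming_pull.result()  # block until the shutdown completes
</code></pre></div></div>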
alt="Registry" /></p> <p><em>Devices</em> <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/aikr23urwn2o5sj9h9qu.png" alt="Devices" /></p> <p><em>Gateway</em> <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tayjuc9w5we9nwmsawrj.png" alt="Gateway" /></p> <p><em>Device binding</em> <img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/f65crdcmkjl60zzu6u3b.png" alt="Device binding" /></p> <p>Seems all good :)</p> <h2 id="graph">Graph</h2> <p>Let’s look at the graph that terraform can generate. We can view it on the cloud shell editor itself.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ terraform graph | dot -Tsvg &gt; graph.svg $ ls *.svg graph.svg </code></pre></div></div> <p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yobgj0gj78kor9hxoymo.png" alt="Graph" /></p> <p>With this the post is complete, thanks for reading !!!</p>This post first appeared on dev.to