<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Rohan Chakravarthy]]></title><description><![CDATA[Rohan Chakravarthy]]></description><link>https://rohanc.me/</link><image><url>https://rohanc.me/favicon.png</url><title>Rohan Chakravarthy</title><link>https://rohanc.me/</link></image><generator>Ghost 5.2</generator><lastBuildDate>Mon, 01 Sep 2025 07:04:47 GMT</lastBuildDate><atom:link href="https://rohanc.me/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Threat Modeling (for beginners)]]></title><description><![CDATA[You need to identify threats before you can secure your application. This post covers the fundamentals of threat modeling and how to incorporate it into your existing software development lifecycle]]></description><link>https://rohanc.me/threat-modeling-beginners/</link><guid isPermaLink="false">62a822d764859a000170fee6</guid><category><![CDATA[security]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Fri, 23 Sep 2022 07:11:30 GMT</pubDate><media:content url="https://rohanc.me/content/images/2022/09/tis_but_a_scratch-3.gif" medium="image"/><content:encoded><![CDATA[<img src="https://rohanc.me/content/images/2022/09/tis_but_a_scratch-3.gif" alt="Threat Modeling (for beginners)"><p>A threat modeling exercise is an important step while designing secure applications. This was an extremely daunting process when I first started working on consumer-facing products. There is a lot of information available and no clear starting point. This post aims to provide that starting point.</p><p>As one of the first engineers on Amazon Care&apos;s core infra/security team, I worked closely with Amazon&apos;s Application Security (AppSec) org. 
I built critical authorization systems and libraries used across the organization. And as you might imagine, this required extensive security reviews and a LOT of threat models along the way.</p><p>This post will help you understand threat modeling fundamentals and how to incorporate it into your existing software development lifecycle. It is by no means a comprehensive guide, but you should get pretty far if you think through all the questions and suggestions below. I&apos;ll include a detailed worksheet in my next post.</p><p>If you&apos;d like a deeper dive, <a href="https://shostack.org/books/threat-modeling-book">Threat Modeling by Adam Shostack</a> is one of my favorite resources on this topic.</p><!--kg-card-begin: markdown--><h4 id="table-of-contents">Table of Contents</h4>
<ul>
<li><a href="#structuring-threats">Structuring Threats</a></li>
<li><a href="#trust-boundaries">Defining Trust Boundaries</a></li>
<li><a href="#two-phases-start-top-down-end-bottom-up">Two Phases: Start top-down, end bottom-up</a></li>
<li><a href="#defense-in-depth">Defense in Depth</a></li>
<li><a href="#threat-identification-model">Threat Identification Model</a></li>
<li><a href="#you-are-more-qualified-than-you-think">You are more qualified than you think</a></li>
</ul>
<!--kg-card-end: markdown--><h2 id="structuring-threats">Structuring Threats</h2><p>Let&apos;s start with a template for documenting threats. I find it helpful to create a table with the columns defined below. Every identified threat is a single row in this table.</p><p>Let&apos;s use a simple example to understand each of these columns: an attacker attempting to exfiltrate customer data through an administrative API.</p><!--kg-card-begin: html--><div style="overflow: scroll;">
<table style="white-space:nowrap; width: 100%">
  <tr>
    <th></th>
    <th>Description</th>
    <th>Example</th>
  </tr>
  <tr>
    <th>Attacker Goal</th>
      <td>What is the attacker attempting to do?</td><td>Exfiltrating all our customer profiles</td>
  </tr>
  <tr>
    <th>Threat Description / Attack runbook</th>
    <td>How will the attacker accomplish their goal?</td>
    <td>Steal an administrative user&apos;s credentials and query the admin APIs</td>
  </tr>
    <tr>
    <th>Business Impact</th>
    <td>What is the business impact of the attacker achieving their goal?</td><td>Loss of customer trust, brand risk</td>
  </tr>
    <tr>
    <th>Risk Category</th>
    <td>What category of risk is this?</td><td>Information Disclosure (I&apos;ll cover specific risk categories in my next post)</td>
  </tr>
    <tr>
    <th>Mitigation(s)</th>
    <td>How does your application mitigate this attack?</td><td>IdP with hardware MFA, short-lived credentials, limited access to admin APIs, rate limiting</td>
  </tr>
    <tr>
    <th>Verification strategy for mitigation(s)</th>
    <td>How will you validate that your mitigations work?</td><td>Manual tests, automated tests</td>
  </tr>
    <tr>
    <th>Incident Discovery Mechanism</th>
    <td>How will you know if this attack was successful despite your mitigations?</td><td>Alarms and anomaly detection on sensitive APIs</td>
  </tr>
<tr>
    <th>Incident Response Plan</th>
    <td>How will you respond to the incident?</td><td>Revoke all tokens. Look at logs to identify which records were exfiltrated. Work with Legal and Compliance teams to identify next steps</td>
  </tr>
</table>
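If you like keeping threat models next to your code, the columns above also map naturally onto a small data type. The following TypeScript sketch is purely illustrative (the interface and field names are mine, not part of any standard), using the admin-API example from the table as the row:

```typescript
// Illustrative shape for one row of the threat table above.
// Field names are invented for this example, not a standard.
interface Threat {
  attackerGoal: string;
  attackRunbook: string;
  businessImpact: string;
  riskCategory:
    | "Spoofing" | "Tampering" | "Repudiation"
    | "Information Disclosure" | "Denial of Service"
    | "Elevation of Privilege"; // the six STRIDE categories
  mitigations: string[];
  verificationStrategy: string[];
  incidentDiscovery: string[];
  incidentResponsePlan: string;
}

// The worked example from the table, expressed as data
const adminApiExfiltration: Threat = {
  attackerGoal: "Exfiltrating all our customer profiles",
  attackRunbook: "Steal an administrative user's credentials and query the admin APIs",
  businessImpact: "Loss of customer trust, brand risk",
  riskCategory: "Information Disclosure",
  mitigations: [
    "IdP with hardware MFA",
    "short-lived credentials",
    "limited access to admin APIs",
    "rate limiting",
  ],
  verificationStrategy: ["Manual tests", "automated tests"],
  incidentDiscovery: ["Alarms and anomaly detection on sensitive APIs"],
  incidentResponsePlan:
    "Revoke all tokens; identify exfiltrated records from logs; engage Legal and Compliance",
};
```

Storing rows like this makes it easy to review threat models in pull requests alongside the code they describe.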
</div><!--kg-card-end: html--><h2 id="trust-boundaries">Trust Boundaries</h2><p>A trust boundary is a logical demarcation in your system beyond which all principals (users, applications, systems) require additional checks. Everything inside the boundary has the same trust level. Everything outside the boundary requires additional (or different) checks. Common examples include VPNs, Virtual Private Clouds (VPCs), and even AWS accounts. There can be many layers of trust boundaries. For example, each of the boxes below could be a trust boundary:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://rohanc.me/content/images/2022/06/trust_boundary.drawio-2-.png" class="kg-image" alt="Threat Modeling (for beginners)" loading="lazy" width="1740" height="1084" srcset="https://rohanc.me/content/images/size/w600/2022/06/trust_boundary.drawio-2-.png 600w, https://rohanc.me/content/images/size/w1000/2022/06/trust_boundary.drawio-2-.png 1000w, https://rohanc.me/content/images/size/w1600/2022/06/trust_boundary.drawio-2-.png 1600w, https://rohanc.me/content/images/2022/06/trust_boundary.drawio-2-.png 1740w" sizes="(min-width: 720px) 720px"><figcaption>Trust Boundaries</figcaption></figure><p>Notice how there is overlap. The same &quot;box&quot; could be part of multiple trust boundaries, and you can reason about them separately. Here are some potential trust boundaries in the diagram above:</p><!--kg-card-begin: markdown--><table>
<thead>
<tr>
<th>Trust Boundary</th>
<th>Potential Trust policies</th>
</tr>
</thead>
<tbody>
<tr>
<td>AWS Account</td>
<td>Each AWS Account could have console access protected via IAM users or AWS SSO</td>
</tr>
<tr>
<td>Subnet</td>
<td>Each Subnet could have its own security group to limit outside access, but resources within it can talk to each other</td>
</tr>
<tr>
<td>VPC</td>
<td>The VPC might have network ACL rules to control outbound internet access for all subnets but might allow all Subnet &lt;-&gt; Subnet communication</td>
</tr>
<tr>
<td>AWS organization</td>
<td>Resource policies could limit resource access to accounts in the org</td>
</tr>
</tbody>
</table>
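To make the layered checks concrete, here is a small, purely illustrative TypeScript sketch. The boundary names echo the table above, but the check logic and the principal naming scheme are invented for this example:

```typescript
// Purely illustrative: each trust boundary applies its own checks,
// and a principal must satisfy every boundary it crosses.
type Check = (principal: string) => boolean;

interface TrustBoundary {
  name: string;
  checks: Check[];
}

// Invented example policies, outermost boundary first
const boundaries: TrustBoundary[] = [
  { name: "aws-organization", checks: [(p) => p.startsWith("org/")] },
  { name: "vpc", checks: [(p) => !p.includes("quarantined")] },
  { name: "subnet", checks: [(p) => p.endsWith("/admin")] },
];

// A request is allowed only if every layer's checks pass
function canReach(principal: string, path: TrustBoundary[]): boolean {
  return path.every((b) => b.checks.every((check) => check(principal)));
}
```

The point is only that each boundary enforces its own policy and a request must pass every boundary it crosses; in practice the checks are IAM policies, security group rules, and network ACLs rather than string predicates.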
<!--kg-card-end: markdown--><p>Define trust boundaries early. Knowing up front which checks a request, data, or user must pass to cross each trust boundary greatly increases your chances of building a secure system. My recommendation is to define these boundaries during the system architecture/design phase.</p><h2 id="two-phases-start-top-down-end-bottom-up">Two Phases: Start top-down, end bottom-up</h2><p>I recommend going through the threat modeling exercise twice, especially for larger systems. </p><figure class="kg-card kg-image-card"><img src="https://rohanc.me/content/images/2022/09/defense_in_depth-1-.png" class="kg-image" alt="Threat Modeling (for beginners)" loading="lazy" width="1834" height="744" srcset="https://rohanc.me/content/images/size/w600/2022/09/defense_in_depth-1-.png 600w, https://rohanc.me/content/images/size/w1000/2022/09/defense_in_depth-1-.png 1000w, https://rohanc.me/content/images/size/w1600/2022/09/defense_in_depth-1-.png 1600w, https://rohanc.me/content/images/2022/09/defense_in_depth-1-.png 1834w" sizes="(min-width: 720px) 720px"></figure><p>The first pass is a broader, top-down approach. From a timeline perspective, this is around the time you will be finalizing your system architecture. Think about threats to your service overall, ignoring low-level component details. This helps to identify major threats introduced by your approach before you spend valuable development cycles. It is a lot cheaper to re-architect a system <em><u>before</u></em> building it.</p><p>The second pass is a more specific, bottom-up approach. Think about threats introduced by specific components you are using: Are there documented anti-patterns for your components? Are you using trusted sources for open-source libraries? Does the SaaS service you are using have robust auth mechanisms in place to protect your customer data? 
</p><p>If your organization invests in low-level design documents (documents with details about specific components you will use: libraries, cloud resources, endpoints, auth mechanisms), you should complete your second pass after writing those documents. Otherwise, perform the second pass towards the end of your development lifecycle.</p><h2 id="defense-in-depth">Defense in Depth</h2><p>Defense in depth (or the <a href="https://en.wikipedia.org/wiki/Swiss_cheese_model">&quot;Swiss Cheese&quot; model</a>) is the strategy of implementing multiple (and sometimes redundant) layers of protection.</p><figure class="kg-card kg-image-card"><img src="https://rohanc.me/content/images/2022/09/Swiss-Cheese-model.jpg" class="kg-image" alt="Threat Modeling (for beginners)" loading="lazy" width="1024" height="597" srcset="https://rohanc.me/content/images/size/w600/2022/09/Swiss-Cheese-model.jpg 600w, https://rohanc.me/content/images/size/w1000/2022/09/Swiss-Cheese-model.jpg 1000w, https://rohanc.me/content/images/2022/09/Swiss-Cheese-model.jpg 1024w" sizes="(min-width: 720px) 720px"></figure><p>It borrows its name from a military strategy, and has the same goal: use preventative measures to slow down an attack and give yourself time to detect and react to it. In addition to slowing down (or discouraging) attackers, it also ensures your systems don&apos;t have a single point of failure. Ideally, a single bad commit or config change shouldn&apos;t bring down your application or open it up to malicious actors. </p><figure class="kg-card kg-image-card"><img src="https://rohanc.me/content/images/2022/09/tis_but_a_scratch.gif" class="kg-image" alt="Threat Modeling (for beginners)" loading="lazy" width="268" height="250"></figure><p>Utilize this strategy while defining mitigations for the threats you have identified. 
For example, a public-facing endpoint can add multiple layers of security, including:</p><ol><li>Use Web Application Firewall (WAF) rules to block known bot IPs</li><li>Add rate limiting rules</li><li>Require a valid auth token</li><li>Validate the caller has access to the resources they are retrieving</li></ol><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://rohanc.me/content/images/2022/09/defense_in_depth.png" class="kg-image" alt="Threat Modeling (for beginners)" loading="lazy" width="1394" height="970" srcset="https://rohanc.me/content/images/size/w600/2022/09/defense_in_depth.png 600w, https://rohanc.me/content/images/size/w1000/2022/09/defense_in_depth.png 1000w, https://rohanc.me/content/images/2022/09/defense_in_depth.png 1394w" sizes="(min-width: 720px) 720px"><figcaption>Defense in Depth for an API</figcaption></figure><h2 id="threat-identification-model">Threat Identification Model</h2><p>There are <a href="https://en.wikipedia.org/wiki/Threat_model">many models</a> you can use to identify your threats. I find <a href="https://en.wikipedia.org/wiki/STRIDE_(security)">STRIDE</a> to be thorough and easy to reason about for applications deployed in a cloud environment, but feel free to use another model - they are all just structured ways to identify threats.</p><h2 id="you-are-more-qualified-than-you-think">You are more qualified than you think</h2><p>Threat modeling identifies the security risks introduced by new versions of your application. It requires an intricate knowledge of your architecture, system design and application logic. This actually makes <em><u>you</u></em> one of the most qualified people to build the threat model for your application. 
&#xA0;You understand it better than any outside party!</p><p>Start with the guidance in this post, but also use your familiarity with the systems to identify creative ways in which attackers could compromise your application.</p><hr><p>I&apos;ll publish a post soon with a detailed STRIDE worksheet to identify threats. As always, you can reach me <a href="https://twitter.com/rohchak/">@rohchak</a></p>]]></content:encoded></item><item><title><![CDATA[AWS VPC Subnet Groups]]></title><description><![CDATA[The L2 VPC cdk construct accepts a list of Subnet Groups. This post explains how subnet groups work, and the reason for some defaults.]]></description><link>https://rohanc.me/aws-cdk-vpc-subnet-groups/</link><guid isPermaLink="false">62a6ac3d64859a000170fdf3</guid><category><![CDATA[vpc]]></category><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Mon, 13 Jun 2022 03:37:07 GMT</pubDate><media:content url="https://rohanc.me/content/images/2022/06/vpc_feature2.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://rohanc.me/content/images/2022/06/vpc_feature2.jpg" alt="AWS VPC Subnet Groups"><p>The <a href="https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_ec2.Vpc.html">L2 VPC cdk construct</a> accepts a list of Subnet Groups in the <code>subnetConfiguration</code> property. 
Subnet Groups only seem to be documented in the context of an <a href="https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/SubnetGroups.html">Elasticache cluster</a>, so I&apos;ll provide a quick breakdown of how they work in the context of VPCs:</p><ol><li><a href="#subnets-are-assigned-cidr-blocks-in-the-order-they-are-defined">Subnets are assigned CIDR blocks in the order they are defined</a></li><li><a href="#subnet-groups-are-deployed-to-every-az">Subnet groups are deployed to every AZ</a></li><li><a href="#reserved-subnet-blocks">You can &quot;Reserve&quot; subnet blocks without deploying a subnet resource</a></li></ol><p>For reference, a Subnet Group (configured using a <a href="https://docs.aws.amazon.com/cdk/api/v1/python/aws_cdk.aws_ec2/SubnetConfiguration.html">SubnetConfiguration</a> instance) has the following configurable properties:</p><!--kg-card-begin: html--><table>
<thead>
<tr>
<th>Property</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>name</td>
<td>string; the logical name for the subnet group</td>
</tr>
<tr>
<td>subnet_type</td>
    <td>valid values: <code>PRIVATE_ISOLATED</code>, <code>PRIVATE_WITH_NAT</code>, <code>PUBLIC</code></td>
</tr>
<tr>
<td>cidr_mask</td>
    <td>valid values: <code>16-28</code></td>
</tr>
<tr>
<td>map_public_ip_on_launch</td>
    <td><code>true</code> by default for public subnets</td>
</tr>
<tr>
<td>reserved</td>
    <td><code>false</code> by default. more details below</td>
</tr>
</tbody>
</table><!--kg-card-end: html--><h3 id="subnets-are-assigned-cidr-blocks-in-the-order-they-are-defined">Subnets are assigned CIDR blocks in the order they are defined</h3><p>Every <a href="https://docs.aws.amazon.com/cdk/api/v1/python/aws_cdk.aws_ec2/SubnetConfiguration.html">Subnet Group entry</a> in the <code>subnetConfiguration</code> list is assigned a CIDR block based on the VPC <code>cidr</code> property and the subnet group <code>cidrMask</code> property. </p><p>Let&apos;s say you have a VPC (<code>cidr:10.0.0.0/16</code>) with a <strong><u>single AZ</u></strong> with 3 entries (<code>cidrMask:24</code>) in the <code>subnetConfiguration</code> list:</p><pre><code class="language-typescript">const vpc = new Vpc(this, &apos;lambda-vpc&apos;, {
    &apos;cidr&apos;: &quot;10.0.0.0/16&quot;,
    &apos;maxAzs&apos;: 1,
    &apos;subnetConfiguration&apos;: [{
        cidrMask: 24,
        name: &apos;the-shy-one&apos;,
        subnetType: SubnetType.PRIVATE_ISOLATED,
    },
    {
        cidrMask: 24,
        name: &apos;the-cute-one&apos;,
        subnetType: SubnetType.PRIVATE_WITH_NAT
    },
    {
        cidrMask: 24,
        name: &apos;the-rebel&apos;,
        subnetType: SubnetType.PUBLIC
    }
    ],
    &apos;vpcName&apos;: &apos;generic-boy-band&apos;
})</code></pre><p>This VPC will have 3 subnets with the following blocks: <code>10.0.0.0/24</code>, <code>10.0.1.0/24</code> and <code>10.0.2.0/24</code>. </p><p>If the VPC has 2 AZs instead, there will be 2 blocks per subnet group - <code>10.0.0.0/24</code> and <code>10.0.1.0/24</code> for the first subnet group, <code>10.0.2.0/24</code> and <code>10.0.3.0/24</code> for the second subnet group and so on. </p><p>Trying to add a new subnet group entry in the middle of the configuration list after the initial deployment will not work. For example, if we modify the example above:</p><figure class="kg-card kg-code-card"><pre><code class="language-typescript">const vpc = new Vpc(this, &apos;lambda-vpc&apos;, {
    &apos;cidr&apos;: &quot;10.0.0.0/16&quot;,
    &apos;maxAzs&apos;: 1,
    &apos;subnetConfiguration&apos;: [{
        cidrMask: 24,
        name: &apos;the-shy-one&apos;,
        subnetType: SubnetType.PRIVATE_ISOLATED,
    },
    {
        cidrMask: 24,
        name: &apos;the-cute-one&apos;,
        subnetType: SubnetType.PRIVATE_WITH_NAT
    },
    // NEW ENTRY
    {
        cidrMask: 24,
        name: &apos;the-copy-cat&apos;,
        subnetType: SubnetType.PRIVATE_WITH_NAT
    },
    {
        cidrMask: 24,
        name: &apos;the-rebel&apos;,
        subnetType: SubnetType.PUBLIC
    }
    ],
    &apos;vpcName&apos;: &apos;generic-boy-band&apos;
})</code></pre><figcaption>updated VPC config</figcaption></figure><p>It <strong><em>will fail </em></strong>with the error:</p><div class="kg-card kg-callout-card kg-callout-card-grey"><div class="kg-callout-emoji">&#x2757;</div><div class="kg-callout-text">Resource handler returned message: &quot;The CIDR [..] conflicts with another subnet&quot;</div></div><h3 id="subnet-groups-are-deployed-to-every-az">Subnet groups are deployed to every AZ</h3><p>Every <a href="https://docs.aws.amazon.com/cdk/api/v1/python/aws_cdk.aws_ec2/SubnetConfiguration.html">Subnet Group entry</a> in the <code>subnetConfiguration</code> list creates a subnet <em>per AZ </em>in the VPC<em>. </em>There is no way to specify different AZs for different subnet groups, nor can you limit a subnet to a single AZ. For example, say your VPC is configured with 2 AZs. You can&apos;t have <code>SUBNET-GROUP-1</code> subnets deployed in <code>us-west-2a</code> and <code>us-west-2b</code>, and <code>SUBNET-GROUP-2</code> deployed in <code>us-west-2b</code> only, at least with the L2 VPC construct. </p><p>This seemed odd to me, so I posted <a href="https://www.reddit.com/r/aws/comments/v89xms/is_it_possible_to_provide_different_subnet/">a question about it on the AWS subreddit</a>. See the discussion in that thread for more details, but here&apos;s why I am convinced this is a good default: </p><p>AZ-aware AWS resources allow specifying the number of AZs to deploy resources into. For example, you can configure an EC2 auto-scaling group to only deploy 2 instances, even if you have 3 AZs available. If I&apos;m creating a VPC with multiple AZs, it means I anticipate needing higher availability guarantees, even if I don&apos;t need it immediately for all my applications. 
Thus, the default subnet group configuration ensures I have that reserved address space whenever I choose to start using the additional AZs (and it does not cost anything).</p><p>This also ensures <em>all subnets in a subnet group form one contiguous block of CIDR address ranges</em>. This simplifies rules for similar subnets. <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/security-group-rules-reference.html">This page</a> has some examples that show how contiguous blocks can be useful.</p><h3 id="reserved-subnet-blocks">&quot;Reserved&quot; subnet blocks</h3><p>Subnet Group configurations also provide a <code>reserved</code> boolean property. Read a detailed description <a href="https://github.com/aws/aws-cdk/issues/2087">here</a>, but this property essentially allows you to reserve certain CIDR blocks without actually creating the subnet resource. </p><p><strong><u>Example:</u></strong> I have a VPC configured with a single AZ. There are 2 kinds of subnets in this VPC, &quot;application&quot; and &quot;database&quot; subnets. All resources in the &quot;application&quot; subnet will have similar access, which will differ from the access granted to resources in the &quot;database&quot; subnet. I expect to eventually need 5 &quot;application&quot; subnets and 2 &quot;database&quot; subnets, but I only need 2 &quot;application&quot; subnets and a single &quot;database&quot; subnet today.</p><p><em>Note: Remember CIDR blocks are allocated <a href="#subnets-are-assigned-cidr-blocks-in-the-order-they-are-defined">in the order you define subnet groups</a>, and you cannot add new groups in the middle of the configuration list.</em></p><p><u><strong>Option 1:</strong></u> Define 3 subnet groups corresponding to the 3 subnets I need today. Add new subnet group entries as needed.</p><pre><code class="language-typescript">const vpc = new Vpc(this, &apos;lambda-vpc&apos;, {
    &apos;cidr&apos;: &quot;10.0.0.0/16&quot;,
    &apos;maxAzs&apos;: 1,
    &apos;subnetConfiguration&apos;: [{
        cidrMask: 24,
        name: &apos;application-1&apos;,
        subnetType: SubnetType.PRIVATE_WITH_NAT,
    },
    {
        cidrMask: 24,
        name: &apos;application-2&apos;,
        subnetType: SubnetType.PRIVATE_WITH_NAT
    },
    {
        cidrMask: 24,
        name: &apos;database-1&apos;,
        subnetType: SubnetType.PRIVATE_ISOLATED
    }],
    &apos;vpcName&apos;: &apos;fake-org&apos;
})</code></pre><p><em>Behavior: </em>Each new subnet group will block a new CIDR block. &quot;Application&quot; and &quot;database&quot; subnets will be interspersed, making <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/security-group-rules-reference.html">rules</a> and configurations based on IP address ranges difficult to manage.</p><p><strong><u>Option 2:</u></strong> Define 5 &quot;application&quot; subnet groups and 2 &quot;database&quot; subnet groups in order today. The 3 subnet groups I need today will have the <code>reserved</code> property set to false (default). The others will have it set to true.</p><pre><code class="language-typescript">const vpc = new Vpc(this, &apos;lambda-vpc&apos;, {
    &apos;cidr&apos;: &quot;10.0.0.0/16&quot;,
    &apos;maxAzs&apos;: 1,
    &apos;subnetConfiguration&apos;: [{
        cidrMask: 24,
        name: &apos;application-1&apos;,
        subnetType: SubnetType.PRIVATE_WITH_NAT,
    },
    {
        cidrMask: 24,
        name: &apos;application-2&apos;,
        subnetType: SubnetType.PRIVATE_WITH_NAT
    },
    {
        cidrMask: 24,
        name: &apos;application-3&apos;,
        subnetType: SubnetType.PRIVATE_WITH_NAT,
        reserved: true
    },
    {
        cidrMask: 24,
        name: &apos;application-4&apos;,
        subnetType: SubnetType.PRIVATE_WITH_NAT,
        reserved: true
    },
    {
        cidrMask: 24,
        name: &apos;application-5&apos;,
        subnetType: SubnetType.PRIVATE_WITH_NAT,
        reserved: true
    },
    {
        cidrMask: 24,
        name: &apos;database-1&apos;,
        subnetType: SubnetType.PRIVATE_ISOLATED
    },
    {
        cidrMask: 24,
        name: &apos;database-2&apos;,
        subnetType: SubnetType.PRIVATE_ISOLATED,
        reserved: true
    }],
    &apos;vpcName&apos;: &apos;fake-org&apos;
})</code></pre><p><em>Behavior</em>: &quot;Application&quot; and &quot;database&quot; subnets will have their own IP address ranges. There are 4 subnet CIDR ranges that have been reserved, but have no associated resources. Over time, the 4 reserved subnets can be deployed as subnet resources.</p><hr><p>That&apos;s it for VPC subnet groups. Look for a post soon on how to deploy a VPC with security best practices using CDK.</p><p>As always, you can reach me <a href="https://twitter.com/rohchak/">@rohchak</a></p>]]></content:encoded></item><item><title><![CDATA[Lets Encrypt + Haproxy]]></title><description><![CDATA[I recently found this great docker image that encapsulates haproxy and cert renewal into a single container]]></description><link>https://rohanc.me/letsencrypt-haproxy/</link><guid isPermaLink="false">62a05b4678647100013290a4</guid><category><![CDATA[haproxy]]></category><category><![CDATA[letsencrypt]]></category><category><![CDATA[ssl]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Wed, 08 Jun 2022 07:53:26 GMT</pubDate><media:content url="https://rohanc.me/content/images/2022/06/ssl3-1.png" medium="image"/><content:encoded><![CDATA[<img src="https://rohanc.me/content/images/2022/06/ssl3-1.png" alt="Lets Encrypt + Haproxy"><p>I&apos;ve used a few different approaches for renewing the Let&apos;s Encrypt certs for my domain over the years, but I recently found <a href="https://github.com/tomdess/docker-haproxy-certbot">this great docker image</a> that encapsulates everything into a single container.</p><h2 id="steps">Steps</h2><ol><li><a href="#create-an-haproxy-cfg-file">Create an haproxy.cfg file</a></li><li><a href="#run-the-docker-container">Run the docker container</a></li></ol><h4 id="create-an-haproxy-cfg-file">Create an haproxy cfg file</h4><p>Here&apos;s my Haproxy config file, slightly modified from the one provided in the repo since I&apos;m serving my site on port 8080 locally. 
I&apos;ve stored it in <code>/etc/haproxy/haproxy.cfg</code></p><pre><code>global
    maxconn 20480
    ############# IMPORTANT #################################
    ## DO NOT SET CHROOT OTHERWISE YOU HAVE TO CHANGE THE  ##
    ## acme-http01-webroot.lua file                        ##
    # chroot /jail                                         ##
    #########################################################
    lua-load /etc/haproxy/acme-http01-webroot.lua
    #
    # SSL options
    ssl-default-bind-ciphers AES256+EECDH:AES256+EDH:!aNULL;
    tune.ssl.default-dh-param 4096

    # workaround for bug #14 (Cert renewal blocks HAProxy indefinitely with Websocket connections)
    hard-stop-after 3s

# DNS run-time resolution on backend hosts
resolvers docker
    nameserver dns &quot;127.0.0.11:53&quot;

defaults
    log global
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    option forwardfor
    option httplog

    # never fail on address resolution
    default-server init-addr last,libc,none

frontend http
    bind *:80
    mode http
    acl url_acme_http01 path_beg /.well-known/acme-challenge/
    http-request use-service lua.acme-http01 if METH_GET url_acme_http01
    redirect scheme https code 301 if !{ ssl_fc }

frontend https
    bind *:443 ssl crt /etc/haproxy/certs/ no-sslv3 no-tls-tickets no-tlsv10 no-tlsv11
    http-response set-header Strict-Transport-Security &quot;max-age=16000000; includeSubDomains; preload;&quot;
    default_backend www

backend www
    server ghost localhost:8080 check
    http-request add-header X-Forwarded-Proto https if { ssl_fc }
</code></pre><h4 id="run-the-docker-container">Run the docker container</h4><p>Once you have the haproxy file set up, run the following command:</p><pre><code>DOMAINS=&quot;[YOUR COMMA SEPARATED LIST OF DOMAINS]&quot;
EMAIL=&quot;[YOUR EMAIL]&quot;
docker run --name haproxy -d \
    --net=&quot;host&quot; \
    -e CERTS=$DOMAINS \
    -e EMAIL=$EMAIL \
    -e STAGING=false \
    --restart=always \
    -v /home/ubuntu/haproxy:/etc/haproxy \
    -p 80:80 -p 443:443 \
    ghcr.io/tomdess/docker-haproxy-certbot:master
</code></pre><p>That&apos;s it! The container runs a cron job that checks your cert weekly and updates it if required</p>]]></content:encoded></item><item><title><![CDATA[AWS Subscription Required: The AWS Access Key Id needs a subscription for the service]]></title><description><![CDATA[I hit an odd error while bootstrapping a new account through cdk bootstrap: SubscriptionRequiredException]]></description><link>https://rohanc.me/aws-subscription-required/</link><guid isPermaLink="false">62a05b4678647100013290a3</guid><category><![CDATA[CDK]]></category><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Wed, 08 Jun 2022 07:52:14 GMT</pubDate><media:content url="https://rohanc.me/content/images/2022/06/error.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://rohanc.me/content/images/2022/06/error.jpg" alt="AWS Subscription Required: The AWS Access Key Id needs a subscription for the service"><p>I hit an odd error while bootstrapping a new account through <code>cdk bootstrap</code>:</p><pre><code>SubscriptionRequiredException: The AWS Access Key Id needs a subscription for the service
</code></pre><p>I found a few different reasons for this error, but they all essentially boil down to trying to use a feature that is not enabled in your account. I tried running some other commands but I soon realized <em><strong>all resource creation was disabled in my account</strong></em>.</p><p>I&apos;d created this account through my AWS Organization so I initially thought it might just be a delay in account setup. However, I kept seeing this error even after waiting for an hour.</p><h4 id="solution">Solution</h4><p>I started poking around my accounts, and eventually realized I had an <strong>overdue bill</strong> in the main organization account due to a cancelled credit card. <em><strong>Paying the bill immediately resolved this error!</strong></em></p><h4 id="other-potential-reasons">Other Potential Reasons</h4><p>Here are other links that cover other reasons for this error:</p><ul><li>AWS China regions don&apos;t support WAF: <a href="https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/1579">https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/1579</a></li><li>Services launched after account creation might not be enabled for your account: <a href="https://aws.amazon.com/premiumsupport/knowledge-center/error-access-service/">https://aws.amazon.com/premiumsupport/knowledge-center/error-access-service/</a></li></ul>]]></content:encoded></item><item><title><![CDATA[Moving Towards High-Value Health Care in the US (Introduction)]]></title><description><![CDATA[The US spends more on health care than any other country in the world but has the same (or worse) health outcomes. 
What led to this?]]></description><link>https://rohanc.me/moving-high-value-healthcare-us/</link><guid isPermaLink="false">62a05b4678647100013290a2</guid><category><![CDATA[healthcare]]></category><category><![CDATA[insurance]]></category><category><![CDATA[high-value healthcare]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Mon, 04 Jan 2021 03:02:00 GMT</pubDate><media:content url="https://github.com/rchakra3/static-assets/raw/master/high-value-healthcare-intro/stethoscope_on_white.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://github.com/rchakra3/static-assets/raw/master/high-value-healthcare-intro/stethoscope_on_white.jpg" alt="Moving Towards High-Value Health Care in the US (Introduction)"><p>It&apos;s taken me way too long to do an in-depth study of the current state of healthcare in the US considering building software solutions to provide quality care is now my full time job!</p>
<p>I think we all know something is wrong with the healthcare system in the US. The US spends more on health care than any other country in the world. 1/3rd of all the funds raised via GoFundMe are for medical expenses. Despite this, health outcomes are not any better than those in other developed countries. The US is actually worse in some common health metrics like life expectancy, infant mortality, and unmanaged diabetes.</p>
<p>I&apos;ve been reading through Dave Chase&apos;s <a href="https://healthrosetta.org/ceoguide/">&quot;The CEO&apos;s guide to restoring the American Dream&quot;: How to Deliver World Class Health Care to your Employees at Half the Cost</a> and working my way through the <a href="https://www.udacity.com/course/health-informatics-in-the-cloud--ud809">Health Informatics in the Cloud</a> course, so I decided to summarize what I&apos;m learning in a series of blog posts. The structure of the posts will largely follow the book, though the content is also sourced from other papers and articles (which I&apos;ve referenced), and conclusions I&apos;ve drawn from said content. I&apos;m still relatively new to this space, so feel free to point out any flawed conclusions.</p>
<p>With this first post, I&apos;m going to provide some background and talk about the status quo in the US healthcare system (as I understand it).</p>
<p><strong>Disclaimer:</strong> All opinions are my own</p>
<h4 id="terminology">Terminology</h4>
<ol>
<li><strong>Medicare:</strong> Federal Health Insurance that covers people over 65 (80+% of beneficiaries), and younger people with certain disabilities and chronic conditions. Part A (Hospital Insurance) covers in-patient treatments and Part B (Medical Insurance) covers outpatient care, medical supplies and preventative care</li>
<li><strong>Acute Conditions</strong>: These are time-sensitive conditions that generally respond to treatment and which can reach a resolution. Conditions that require urgent, emergency or critical care fall into this category. This covers conditions ranging from a broken ankle to heart attacks.</li>
<li><strong>Chronic Conditions</strong>: By definition a Chronic condition is <em><strong>not curable</strong></em>. The goal while treating a chronic condition is disease management to improve the patient&apos;s quality of life. With improvements in medicine, many previously &quot;Terminal&quot; conditions are now Chronic. Examples include Diabetes, Hypertension and Cancer.</li>
</ol>
<h4 id="somecrazynumbers">Some crazy numbers</h4>
<ol>
<li>The Health Care Industry spent <em><strong>$1.2 Billion lobbying to influence the Affordable Care Act in 2009</strong></em>. That seems like a lot, but <em><strong>annual healthcare spending in the US was $3.8 Trillion in 2019</strong></em>. The lobbying money was a drop in the bucket.</li>
<li>There are no laws that hold healthcare organizations responsible for misdiagnosis. Meanwhile, <em><strong>medical errors are the 3rd leading cause of death in the US</strong></em>, 5% of all diagnoses are incorrect, and &gt;20% of incorrect diagnoses cause life-altering or life-threatening consequences</li>
<li>FDA approvals don&apos;t mean as much as you&apos;d expect. <em><strong>57% of cancer medications</strong></em> approved by the FDA between 2008 and 2012 had unknown effects on overall survival or failed to show gains in survival rates</li>
<li>Almost <em><strong>50% of the adult population in the US has at least 1 chronic condition</strong></em>, and 27% has 2 or more.</li>
<li>More than <em><strong>80% of adults over the age of 65 have at least 1 chronic disease</strong></em>
<ul>
<li>This number is &gt;60% for 2 or more and &gt;20% for 5 or more!</li>
<li>Patients with 5 or more chronic conditions account for 67% of Medicare expenditure</li>
</ul>
</li>
</ol>
<h4 id="history">History</h4>
<p>Surprisingly, the employer-provided health insurance model that is prevalent today can be traced back to a WW2 policy to prevent hyperinflation!<br>
Up until the late 1930s, individuals paid most health care costs out of pocket and relied on individual health insurance plans to offset any unforeseen large expenses. During WW2, the US government introduced price and wage controls in an attempt to prevent hyperinflation. However, in a concession to placate labor groups, employer-sponsored health benefits were excluded from this wage cap. This resulted in employers offering increasingly elaborate health benefits as a means to attract employees. The IRS subsequently made all such health benefits tax exempt for both employers and employees.</p>
<p>This resulted in the current status quo, with employer-provided health insurance being the norm for the following reasons:</p>
<ol>
<li>Employers were now against any kind of reform that resulted in health benefits being taxed, since this meant that payroll taxes would go up</li>
<li>Since health benefits now covered more than just unforeseen large expenses, employees were more likely to visit doctors and hospitals. In theory, this seems ideal, but in practice this incentivizes hospital systems to increase prices since the employee often doesn&apos;t see the actual cost. Thus, Hospitals were now also incentivized to oppose any such reform.</li>
<li>Insurance Providers had a much larger pool of covered individuals, and had customers (employers) who were willing to pay higher premiums over time</li>
<li>Buying individual health insurance became more expensive than opting for an employer-provided plan</li>
</ol>
<h4 id="currentstate">Current State</h4>
<p>Over time, this move towards employer-sponsored health care led to the arguably broken health care system we see today.</p>
<h6 id="annualhealthcarepremiumincreases">Annual Healthcare Premium Increases</h6>
<p>Over time, annual increases in healthcare premiums became the norm. Most businesses expect an <em><strong>11-14% annual increase</strong></em> in costs, and insurance brokers take advantage of this knowledge to bump up per-employee premiums annually. In contrast, median middle class wages <em><strong>increased by ~1% annually</strong></em> from 2010-2016. <strong>The majority of an employer&apos;s per-employee payroll cost increase never reaches the employee!</strong> This doesn&apos;t even account for the fact that annual out-of-pocket healthcare expenses continue to increase for the employee anyway.</p>
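<p>To make the gap concrete, here is a small illustration (the numbers are hypothetical: a 12% annual premium increase, roughly the midpoint of the 11-14% range above, compounded against 1% wage growth, both starting from an index of 100):</p>

```python
# Hypothetical illustration: compound an assumed 12% annual premium increase
# against ~1% annual wage growth over a decade (both indexed to 100 at year 0).
premium, wage = 100.0, 100.0
for _ in range(10):
    premium *= 1.12  # premiums grow 12% per year (assumed midpoint of 11-14%)
    wage *= 1.01     # wages grow ~1% per year

print(round(premium, 1), round(wage, 1))
# prints: 310.6 110.5
```

<p>After a decade, premiums have roughly tripled while wages have grown about 10%, which is why the per-employee cost increase rarely shows up in paychecks.</p>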
<h6 id="morespecialtycarelessprimarycare">More Specialty Care, Less Primary Care</h6>
<p>Employee demands for benefits increased, resulting in offerings like High Deductible Health Plans (HDHPs). HDHPs are intended to incentivize consumer-driven healthcare, giving access to a huge number of providers and specialists. However, this results in patients choosing specialists over Primary Care Physicians. Specialists dealing with acute conditions tend to order far more expensive tests than a PCP, leading to increased costs.<br>
In addition, patients with Chronic health conditions see multiple specialists within the span of a year, with very little communication between them. <strong>A patient with 5 or more chronic conditions will, on average, <em>see 14 providers and fill 50 prescriptions every year</em>, for the rest of their lives</strong>. The lack of communication can lead to multiple adverse consequences, including, but not limited to, bad medication interactions and unchecked compounding effects leading to serious, acute-care episodes.<br>
This kind of single-condition, specialist-based model, along with a lack of emphasis on preventative care, is why you will often read that the US healthcare system rewards acute care over other forms of care. If the emphasis was on Primary Care, we would have a lot more preventative care, leading to fewer instances of chronic conditions caused by lifestyle factors and lower overall medical care costs.</p>
<h6 id="obfuscatedpricing">Obfuscated Pricing</h6>
<p>The current system has created a cycle of bad incentives. Hospitals charge higher prices on paper for procedures. The insurance providers &quot;negotiate&quot; down the prices, supposedly on behalf of the customer paying the insurance premium. They use the price drop as a way to prove their value to customers. However, the following is happening under the hood:</p>
<ul>
<li>Hospitals know they are never going to see the entire on-paper price of a service from insurance, so they continue to hike the prices</li>
<li>Insurance Providers don&apos;t care about the price quoted. They only care about the actual charge, which is significantly lower after they &quot;negotiate&quot; the price down</li>
<li>The larger the drop in prices, the more the insurance provider has &quot;saved&quot; their customers</li>
<li>No matter what the final charge ends up being, the insurance provider gets a cut of that charge</li>
<li>As a result, the insurance provider is making money both with higher premiums and via the cut from the payment to the hospital</li>
</ul>
<p>This is why if patients don&apos;t share their insurance information and instead ask for cash-only prices, they often see ridiculous drops in prices. The book talks about an extreme example where an MRI cost $3500 via insurance, and $475 with cash.</p>
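<p>A quick sketch of the incentive math: the $3,500 list price and $475 cash price come from the book&apos;s MRI example, but the negotiated amount is an assumption for illustration only.</p>

```python
# Incentive math behind "negotiated" prices. The list and cash prices come
# from the book's MRI example; the negotiated figure is an assumed value.
list_price = 3500   # hospital's on-paper price
negotiated = 1400   # what the insurer actually pays (assumed)
cash_price = 475    # cash-only price

insurer_claimed_savings = list_price - negotiated  # the "value" the insurer shows the employer
cash_discount = negotiated - cash_price            # how much insurance still overpays vs cash

print(insurer_claimed_savings, cash_discount)
# prints: 2100 925
```

<p>The higher the list price, the bigger the &quot;savings&quot; the insurer can claim, even though the cash payer still comes out far ahead.</p>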
<h6 id="impactonsmallbusinesslowerincomeemployees">Impact on Small Business &amp; Lower-Income Employees</h6>
<p>Offering healthcare benefits is now relatively expensive for smaller businesses, and not mandatory by law. As a result, <strong>only about 30% of businesses with &lt;50 employees offer health benefits</strong>. This means the employees with the lowest average income also end up having to pay out of pocket for health insurance.</p>
<p>Even when low-income employees (making &lt;$25k annually) end up getting company provided insurance, their premiums are comparable to those paid by high income employees OR they get fewer benefits. They are also the least likely to see any kind of tax benefits from healthcare benefits exemptions.</p>
<h4 id="regulationsorlackthereof">Regulations (or lack thereof)</h4>
<p>From my (basically outsider) perspective, there is shockingly little regulation of some of the core contributors to the current state of health care. I&apos;m sure the health care industry&apos;s massive lobbying power has nothing to do with it.</p>
<h6 id="qualityofcare">Quality of Care</h6>
<p>Health care providers are not accountable for the quality of care provided to their patients. Indeed, with the focus on short, acute-care episodes, there was no standard way to measure quality. Some states, such as Ohio, have recently started down the path of assessing quality based on pre-defined episode-based care.</p>
<p>Studies have shown that in patient visits preceding hospitalizations, 20% of diagnoses were incorrect. Overall, 5% of all diagnoses are incorrect. This leads to money spent on treatments and medication that has no benefit (and could have harm), over-treatment and, in some cases, death. A Johns Hopkins study showed that medical errors are the 3rd leading cause of death in the US. Yet, there are no regulations that hold providers accountable for misdiagnosis.</p>
<p>It is important to note here that the providers themselves are under enormous pressure to see more patients in less time, and carry a heavy administrative burden due to hospital policies. A study showed that for every hour a provider spends seeing patients, they spend another two on administrative tasks. The System is broken.</p>
<h6 id="lackoftransparency">Lack of Transparency</h6>
<p>Insurance carriers have no legal obligation to share claims data with the employer paying the premiums. In many cases carriers refuse to share claims data, and when they do access is often provided to only a small subset of all claims. This lack of transparency prevents employers (the actual customers of this service) from performing any kind of basic discrepancy analysis. This also allows hospitals and care providers with lower quality care outcomes to charge disproportionately higher costs without any oversight.</p>
<h6 id="blatantconflictsofinterests">Blatant Conflicts of Interests</h6>
<p>The same insurance carrier can administer the health plan for an employer, as well as for the hospital through which care is being provided. Hospitals are often huge employers, which means a massive, guaranteed annual income stream for the insurance carrier. Along with the lack of transparency mentioned above, this means carriers will often not try to negotiate the price of services with these hospitals (to keep them on as clients), leading to higher premiums for other employers serviced by that carrier. This is a clear conflict of interest, since insurance carriers should technically be working in the interests of all their clients individually. In addition, there is no regulation preventing a hospital from owning an insurance carrier!</p>
<p>Insurance brokers are supposed to work on behalf of their clients to try and find the most attractive plan and insurance carrier for them. However, insurance carriers give brokers a year-end bonus based on their client retention rate. Brokers are not obligated to disclose this to their clients. Another case of a clear conflict of interest - the Broker is incentivized to get their clients to renew their plan, regardless of actual value.</p>
<p>Similarly, Benefits Managers can hire benefits &quot;Consultants&quot; paid for by insurance companies and brokers. The consultant&apos;s salary is being paid for by one of the manager&apos;s prospective health insurance options. Why would the consultant ever suggest an alternative?</p>
<p>Somehow, none of these things are required disclosures for hospitals, insurance providers or brokers.</p>
<h4 id="summary">Summary</h4>
<p>This post is already way longer than I&apos;d expected, so I&apos;m going to stop here. There are a lot of avenues for improvement, and there are multiple case studies proving that there are approaches to health care (even with the employer-paid model) that result in better overall health while significantly reducing costs. Technology can definitely play an important role in cutting costs and improving efficiency, but meaningful change requires both employers and patients to make deliberate choices informed by data and case studies. The next couple of posts will aim to summarize some of the proposed alternatives to the current state.</p>
<h4 id="references">References:</h4>
<p>[1] <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1497638/pdf/15158105.pdf">&quot;The Growing Burden of Chronic Disease in America&quot;</a></p>
<p>[2] <a href="https://www.cdc.gov/pcd/issues/2020/20_0130.htm">&quot;Prevalence of Multiple Chronic Conditions Among US Adults, 2018&quot;</a></p>
<p>[3] <a href="https://www.cms.gov/mmrr/downloads/mmrr2013_003_02_b02.pdf">&quot;Medicare Payments: How Much Do Chronic Conditions Matter?&quot;</a></p>
<p>[4] <a href="https://publicintegrity.org/health/lobbyists-swarm-capitol-to-influence-health-reform/">&quot;Lobbyists swarm Capitol to influence health reform&quot;</a></p>
<p>[5] <a href="https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2463590">&quot;Cancer Drugs Approved on the Basis of a Surrogate End Point and Subsequent Overall Survival&quot;</a></p>
<p>[6] <a href="https://www.rand.org/content/dam/rand/pubs/research_briefs/2011/RAND_RB9605.pdf">&quot;How Does Growth in Health Care Costs Affect the  American Family?&quot;</a></p>
<p>[7] <a href="https://www.kff.org/other/state-indicator/firms-offering-coverage-by-size/?currentTimeframe=0&amp;selectedDistributions=firms-with-fewer-than-50-employees&amp;selectedRows=%7B%22wrapups%22:%7B%22united-states%22:%7B%7D%7D%7D&amp;sortModel=%7B%22colId%22:%22Location%22,%22sort%22:%22asc%22%7D">&quot;Private Firms offering Health Care by size&quot;</a></p>
<p>[8] <a href="https://medicaid.ohio.gov/provider/PaymentInnovation/episodes">&quot;Ohio&apos;s Episode-Based Care&quot;</a></p>
<p>[9] <a href="https://www.vox.com/policy-and-politics/2019/9/30/20891305/health-care-employer-sponsored-premiums-cost-voxcare">&quot;Health care is getting more and more expensive, and low-wage workers are bearing more of the cost&quot;</a></p>
<p>[10] <a href="https://www.investopedia.com/financial-edge/0912/which-income-class-are-you.aspx">&quot;Which income class are you?&quot;</a></p>
<p>[11] <a href="https://qualitysafety.bmj.com/content/23/9/727">&quot;The frequency of diagnostic errors in outpatient care: estimations from three large observational studies involving US adult populations&quot;</a></p>
<p>[12] <a href="https://pubmed.ncbi.nlm.nih.gov/27595430/">&quot;Allocation of Physician Time in Ambulatory Practice: A Time and Motion Study in 4 Specialties&quot;</a></p>
<p>[13] <a href="https://www.axios.com/gofundme-medical-expenses-health-care-costs-d643ed66-2a0a-464c-aad7-5f904288d60d.html">&quot;GoFundMe&apos;s place in the health care system&quot;</a></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Interesting announcements at Ignite 2018]]></title><description><![CDATA[Interesting (Azure-related) announcements from Microsoft Ignite 2018 about IoT, Ops, Edge, Containers, Kubernetes and Serverless]]></description><link>https://rohanc.me/interesting-announcements-at-ignite-2018/</link><guid isPermaLink="false">62a05b4678647100013290a1</guid><category><![CDATA[kubernetes]]></category><category><![CDATA[azure]]></category><category><![CDATA[CI/CD]]></category><category><![CDATA[containers]]></category><category><![CDATA[ignite2018]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Mon, 05 Nov 2018 18:19:09 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>I thought I&apos;d share some of the announcements at Microsoft Ignite 2018 that I found really interesting. Obviously this is a subset of announcements, related only to products I&apos;ve used or am planning to use in the future.</p>
<h5 id="iothubedge">IoT Hub/Edge:</h5>
<ol>
<li>There&#x2019;s a <strong>Jenkins plugin</strong> for <strong>edge modules</strong> that enables builds and deployments to the devices: <a href="https://azure.microsoft.com/en-us/blog/developer-tooling-improvements-for-azure-iot-edge/">https://azure.microsoft.com/en-us/blog/developer-tooling-improvements-for-azure-iot-edge/</a></li>
</ol>
<ul>
<li><strong>IoT Edge Module Marketplace</strong>. A place for hardware manufacturers/3rd party software vendors to publish hardware specific modules: <a href="https://azure.microsoft.com/en-us/blog/publish-your-azure-iot-edge-modules-in-azure-marketplace/">https://azure.microsoft.com/en-us/blog/publish-your-azure-iot-edge-modules-in-azure-marketplace/</a></li>
<li><strong>Edge devices</strong> can now <strong>restart</strong> and establish connections to the edge hub <strong>even if the hub isn&#x2019;t connected to the internet</strong> (Still requires a one-time sync): <a href="https://azure.microsoft.com/en-us/blog/extended-offline-operation-with-azure-iot-edge/">https://azure.microsoft.com/en-us/blog/extended-offline-operation-with-azure-iot-edge/</a></li>
<li><strong>Device Twin property based routing</strong> in IoT Hub: <a href="https://azure.microsoft.com/en-us/blog/a-powerful-and-intuitive-way-to-route-device-messages-in-azure-iot-hub/">https://azure.microsoft.com/en-us/blog/a-powerful-and-intuitive-way-to-route-device-messages-in-azure-iot-hub/</a></li>
<li><strong>Digital Twins</strong> in public preview: <a href="https://azure.microsoft.com/en-us/blog/announcing-the-public-preview-of-azure-digital-twins/">https://azure.microsoft.com/en-us/blog/announcing-the-public-preview-of-azure-digital-twins/</a></li>
<li>The <strong>Device Provisioning service</strong> has much higher limits now</li>
<li><strong>Edge support for Blobs:</strong> <a href="https://docs.microsoft.com/en-us/azure/iot-edge/how-to-store-data-blob">https://docs.microsoft.com/en-us/azure/iot-edge/how-to-store-data-blob</a></li>
</ul>
<h5 id="cosmosdb">Cosmos DB:</h5>
<ol>
<li>Support for <strong>multiple masters</strong> (which allows scaling writes across regions): <a href="https://azure.microsoft.com/en-us/blog/azure-cosmos-db-database-for-intelligent-cloud-intelligent-edge-era/">https://azure.microsoft.com/en-us/blog/azure-cosmos-db-database-for-intelligent-cloud-intelligent-edge-era/</a></li>
</ol>
<ul>
<li><strong>Reserved Capacity</strong> (subscription plan vs pay-as-you-go): <a href="https://azure.microsoft.com/en-us/blog/announcing-general-availability-of-azure-cosmos-db-reserved-capacity/">https://azure.microsoft.com/en-us/blog/announcing-general-availability-of-azure-cosmos-db-reserved-capacity/</a></li>
</ul>
<h5 id="containerscontainerregistryk8s">Containers/Container Registry/K8s:</h5>
<ol>
<li>The Registry now has support for <strong>Helm Chart Repos</strong>, <strong>Docker Content Trust</strong> and <strong>ACR tasks</strong>: <a href="https://azure.microsoft.com/en-us/blog/azure-container-registry-public-preview-of-helm-chart-repositories-and-more/">https://azure.microsoft.com/en-us/blog/azure-container-registry-public-preview-of-helm-chart-repositories-and-more/</a></li>
</ol>
<ul>
<li>ACR tasks are pretty cool &#x2013; if you have multiple images dependent on a certain base image, you can trigger updated builds for all of them when you update the base image.<br>
It also seems to overlap with much of what Azure Pipelines can do.</li>
<li><strong>Azure Container Instances (ACI)</strong> can be deployed into existing VNETs: <a href="https://azure.microsoft.com/en-us/updates/aci-vnet/">https://azure.microsoft.com/en-us/updates/aci-vnet/</a></li>
<li><strong>K8S</strong> is available on <strong>Stack</strong> in preview</li>
</ul>
<h5 id="ops">Ops:</h5>
<ol>
<li>As you may have already heard me say elsewhere, <strong>Azure Pipelines is AMAZING:</strong> <a href="https://azure.microsoft.com/en-us/blog/azure-pipelines-is-the-ci-cd-solution-for-any-language-any-platform-any-cloud/">https://azure.microsoft.com/en-us/blog/azure-pipelines-is-the-ci-cd-solution-for-any-language-any-platform-any-cloud/</a></li>
</ol>
<ul>
<li>Configuring a build is really clean. (I&apos;ll create a build using a public repo soon!)</li>
<li>Lets you define &#x201C;service connections&#x201D; to external services like a Docker registry to pull images from, Kubernetes clusters, or Jenkins</li>
<li>Build steps can run on either the VM or within (one or more) containers</li>
<li>Allows defining multiple &#x201C;jobs&#x201D; that run in parallel</li>
<li>Extensions for deploying to AWS, Azure, K8S clusters</li>
<li><strong>Resource specific alerts</strong> (configurable alerts for platform issues. This might be useful for alerting systems): <a href="https://azure.microsoft.com/en-us/blog/get-notified-when-your-azure-resources-become-unavailable/">https://azure.microsoft.com/en-us/blog/get-notified-when-your-azure-resources-become-unavailable/</a></li>
<li><strong>Deployment Manager</strong> (for multi-stage/multi-region deployments): <a href="https://azure.microsoft.com/en-us/blog/azure-deployment-manager-now-in-public-preview/">https://azure.microsoft.com/en-us/blog/azure-deployment-manager-now-in-public-preview/</a></li>
</ul>
<h5 id="misc">Misc:</h5>
<ol>
<li><strong>Recommendation system for models based on your dataset</strong>. I&#x2019;m assuming this will work well for generic problems. Definitely something interesting to play around with!: <a href="https://azure.microsoft.com/en-us/blog/announcing-automated-ml-capability-in-azure-machine-learning/">https://azure.microsoft.com/en-us/blog/announcing-automated-ml-capability-in-azure-machine-learning/</a></li>
<li><strong>HDInsight supports Hadoop 3.0</strong>:<br>
<a href="https://azure.microsoft.com/en-us/blog/azure-hdinsight-brings-next-generation-hadoop-3-0-and-enterprise-security-to-the-cloud/">https://azure.microsoft.com/en-us/blog/azure-hdinsight-brings-next-generation-hadoop-3-0-and-enterprise-security-to-the-cloud/</a><br>
<a href="https://azure.microsoft.com/en-us/blog/deep-dive-into-azure-hdinsight-4-0/">https://azure.microsoft.com/en-us/blog/deep-dive-into-azure-hdinsight-4-0/</a></li>
<li>Azure CDN is GA: <a href="https://azure.microsoft.com/en-us/blog/microsoft-s-content-delivery-network-is-now-generally-available/">https://azure.microsoft.com/en-us/blog/microsoft-s-content-delivery-network-is-now-generally-available/</a></li>
<li><strong>Functions V2 is GA</strong>: <a href="https://azure.microsoft.com/en-us/blog/introducing-azure-functions-2-0/">https://azure.microsoft.com/en-us/blog/introducing-azure-functions-2-0/</a></li>
</ol>
<ul>
<li>Support for Java and Python is still in preview</li>
<li>Consumption plan for Linux is in preview</li>
<li><strong>Event Hubs</strong> is available on <strong>Azure Stack</strong></li>
<li><strong>Service Fabric</strong> is available on <strong>Azure Stack</strong></li>
</ul>
<p><em><strong>As always, feel free to reach out to me <a href="https://twitter.com/rohchak">@rohchak</a> if you have any questions!</strong></em></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Generate valid signed X509 client certificates with pyopenssl]]></title><description><![CDATA[Use the pyopenssl library to generate valid signed X509 certs. Includes steps to debug invalid certs!]]></description><link>https://rohanc.me/valid-x509-certs-pyopenssl/</link><guid isPermaLink="false">62a05b4678647100013290a0</guid><category><![CDATA[pyopenssl]]></category><category><![CDATA[invalid]]></category><category><![CDATA[error]]></category><category><![CDATA[X.509]]></category><category><![CDATA[certificate]]></category><category><![CDATA[openssl]]></category><category><![CDATA[windows]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Fri, 05 Oct 2018 19:18:56 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>I spent a while today trying to figure out why a certificate deemed valid by the openssl <code>verify</code> command was invalid on my Windows machine.</p>
<p>Errors such as:</p>
<ul>
<li>&quot;This certificate has an invalid digital signature.&quot;</li>
<li>&quot;The integrity of this certificate cannot be guaranteed. The certificate may have been corrupted or may have been altered&quot;</li>
</ul>
<p>I used the <a href="https://github.com/pyca/pyopenssl">pyopenssl</a> library to generate my CA cert as well as the client certificate. You might already have an intermediate certificate and won&apos;t need to generate the CA cert. I&apos;ll add a link to working code at the end of this post. Feel free to scroll down if that&apos;s what you&apos;re interested in.</p>
<p>There are a couple of reasons why you might be seeing these errors:</p>
<ol>
<li>Public Key Length is &lt;1024 bits. See <a href="https://security.stackexchange.com/questions/65618/this-certificate-has-an-invalid-digital-signature#65626">this Security Stack Exchange post</a> and <a href="https://morgansimonsen.com/2013/05/30/what-does-the-this-certificate-has-an-invalid-digital-signature-message-actually-mean/">this blog post</a> for more details</li>
<li>Your Certificate Revocation List(CRL) Endpoints have been misconfigured or aren&apos;t reachable</li>
<li>You used the pyopenssl library and added the <code>&quot;subjectKeyIdentifier&quot; X509Extension</code> before setting the public key to use.</li>
</ol>
<p>I&apos;ll let you guess which one I hit :)</p>
<p>Here&apos;s what happened. My signed client cert generation script looked something like this:</p>
<pre><code>client_cert.add_extensions([
        crypto.X509Extension(b&quot;authorityKeyIdentifier&quot;, False, b&quot;keyid&quot;, issuer=root_ca_cert),
    ])

client_cert.add_extensions([
        crypto.X509Extension(b&quot;subjectKeyIdentifier&quot;, False, b&quot;hash&quot;, subject=client_cert),
    ])

client_cert.set_issuer(root_ca_subj)
client_cert.set_pubkey(client_key)
</code></pre>
<p>This took me a while to identify, but the client certificate generated had extensions looking like this:</p>
<pre><code>X509v3 Authority Key Identifier: 
    keyid:DA:39:A3:EE:5E:6B:4B:0D:32:55:BF:EF:95:60:18:90:AF:D8:07:09
X509v3 Subject Key Identifier: 
    DA:39:A3:EE:5E:6B:4B:0D:32:55:BF:EF:95:60:18:90:AF:D8:07:09
</code></pre>
<p>Notice how the Subject Key and Authority Key are identical? The only time that should happen is in the case of a self-signed cert.</p>
<p>Digging in further, this key was actually the Subject Key of my CA certificate (the one I was trying to use to sign this client cert).</p>
<p>There&apos;s more to it, but at a high level, if a CA (with <strong>Subject Key ID = CA_ski</strong>) is signing a client cert, the client cert should have:</p>
<p><em>Subject Key ID = [something unique]</em><br>
<em>Authority Key ID = <strong>CA_ski</strong></em></p>
<p>Alright. So how is this subject key id generated? Let&apos;s go look at the (theoretical) source of truth :). From <a href="https://datatracker.ietf.org/doc/rfc5280/?include_text=1">RFC 5280</a>:</p>
<pre><code>For end entity certificates, subject key identifiers SHOULD be derived from the public key
</code></pre>
<p>And that&apos;s when I found my bug. If you look at my code, I&apos;m setting the public key <em><strong>after</strong></em> adding the subjectKeyIdentifier extension. As a result the library seems to default to using the CA&apos;s key to generate the subjectKeyIdentifier.</p>
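<p>A fun way to confirm this: subject key identifiers are commonly computed as the SHA-1 hash of the DER-encoded public key, and the keyid in the broken cert above happens to be exactly the SHA-1 digest of <em>empty</em> input. Here&apos;s a small stdlib-only sketch (<code>key_id</code> is a hypothetical helper, not part of pyopenssl):</p>

```python
import hashlib

def key_id(der_public_key: bytes) -> str:
    """SHA-1 of the key bytes, formatted like openssl's keyid output."""
    digest = hashlib.sha1(der_public_key).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))

# Hashing empty input reproduces the keyid from the broken cert above,
# consistent with the hash being computed before any key material was attached:
print(key_id(b""))
# DA:39:A3:EE:5E:6B:4B:0D:32:55:BF:EF:95:60:18:90:AF:D8:07:09
```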
<p>The fix is to set the public key on the client cert <em><strong>before</strong></em> adding the subjectKeyIdentifier. So the code should look more like this:</p>
<pre><code>client_cert.set_issuer(root_ca_subj)
client_cert.set_pubkey(client_key)

client_cert.add_extensions([
        crypto.X509Extension(b&quot;authorityKeyIdentifier&quot;, False, b&quot;keyid&quot;, issuer=root_ca_cert),
    ])

client_cert.add_extensions([
        crypto.X509Extension(b&quot;subjectKeyIdentifier&quot;, False, b&quot;hash&quot;, subject=client_cert),
    ])
</code></pre>
<p>The client cert generated by this will have the right authorityKeyIdentifier and a unique subjectKeyIdentifier correctly derived from the client public key!</p>
<p><em><strong>Feel free to reach out to me <a href="https://twitter.com/rohchak">@rohchak</a> if you have any questions!</strong></em></p>
<p>Here&apos;s the code:</p>
<p>CA Cert gen:</p>
<script src="https://gist.github.com/rchakra3/d56456249b78208638029cad1837e192.js"></script>
<br>
<p>Signed Client Cert Gen:</p>
<script src="https://gist.github.com/rchakra3/2fd6d29e632175633f8f506c88bccbc8.js"></script>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[SSSD with Active Directory on Ubuntu]]></title><description><![CDATA[Configure SSSD to allow SSH access authenticated against an AD instance. This was required to enable advanced security features on our Ambari Hadoop cluster]]></description><link>https://rohanc.me/sssd-active-directory-ubuntu/</link><guid isPermaLink="false">62a05b46786471000132909e</guid><category><![CDATA[Active Directory]]></category><category><![CDATA[SSSD]]></category><category><![CDATA[Ubuntu]]></category><category><![CDATA[Ambari]]></category><category><![CDATA[Hadoop]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Mon, 16 Jul 2018 00:05:19 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>We&apos;re in the middle of deploying multiple Hadoop clusters with different flavors. Since many of Azure&apos;s larger customers use an on-prem Active Directory forest for authentication, extending those identities and permissions to their Hadoop clusters was an important requirement.</p>
<p>Once the Hadoop cluster&apos;s been Kerberized, various security/identity features, including user group mappings, require SSSD. (There are other methods, but none of them seemed as secure; LDAP, for example, requires saving credentials in a file somewhere on disk.)</p>
<p>I found many different install guides for getting SSSD with Active Directory working on CentOS hosts, but something always seemed to be broken when following the same steps on Ubuntu. I&apos;ve included links to some of the resources I used, but none of them worked exactly as advertised on Ubuntu.</p>
<p>The following steps will get you a <strong>domain-joined</strong>, <strong>Ubuntu 16.04</strong> machine that allows <strong>SSH access using Active Directory credentials</strong>.</p>
<p>This guide does not include the steps to get a Kerberos Realm and KDC setup. There are many guides that go through that initial process. I&apos;ve included some of those links at the end of my post.</p>
<p>Here&apos;s a description of the variables we&apos;ll use (Pay attention to the <strong>casing</strong> in the examples):</p>
<ul>
<li>AD_DOMAIN: <strong>mydomain.local</strong></li>
<li>AD_REALM: <strong>MYDOMAIN.LOCAL</strong></li>
<li>WORKGROUP: <strong>MYDOMAIN</strong></li>
</ul>
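<p>The casing matters because Kerberos realm names are conventionally the DNS domain upper-cased, and the NetBIOS workgroup is the first label of the realm. As a quick sanity check (just a sketch; substitute your own domain), you can derive the other two values from the domain:</p>

```shell
AD_DOMAIN="mydomain.local"
# Kerberos realm: the DNS domain, upper-cased
AD_REALM=$(printf '%s' "$AD_DOMAIN" | tr '[:lower:]' '[:upper:]')
# NetBIOS workgroup: the first label of the realm
WORKGROUP=$(printf '%s' "$AD_REALM" | cut -d. -f1)
echo "$AD_REALM $WORKGROUP"   # MYDOMAIN.LOCAL MYDOMAIN
```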
<h6 id="installtherelevantcomponents">Install the relevant components</h6>
<pre><code>apt install -y krb5-user samba sssd chrony
</code></pre>
<h6 id="configuresambafornetbios">Configure Samba for Netbios</h6>
<p>vim <code>/etc/samba/smb.conf</code></p>
<pre><code># Delete the workgroup line and add these:
   workgroup = WORKGROUP
   client signing = yes
   client use spnego = yes
   kerberos method = secrets and keytab
   realm = AD_REALM
   security = ads
</code></pre>
<h6 id="createthesssdconffile">Create the sssd conf file</h6>
<p>vim <code>/etc/sssd/sssd.conf</code></p>
<pre><code>[sssd]
services = nss, pam, ssh, autofs, pac
config_file_version = 2
domains = AD_DOMAIN
override_space = _

[domain/AD_DOMAIN]
id_provider = ad
auth_provider = ad
chpass_provider = ad
access_provider = ad
enumerate = False
krb5_realm = AD_REALM
ldap_schema = ad
ldap_id_mapping = True
cache_credentials = True
ldap_access_order = expire
ldap_account_expire_policy = ad
ldap_force_upper_case_realm = true
fallback_homedir = /home/%d/%u
default_shell = /bin/false
ldap_referrals = true
use_fully_qualified_names = False

[nss]
memcache_timeout = 3600
override_shell = /bin/bash
</code></pre>
<h6 id="setsssdconfpermissions">Set sssd conf permissions</h6>
<pre><code>chown root:root /etc/sssd/sssd.conf
chmod 600 /etc/sssd/sssd.conf
</code></pre>
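<p>SSSD refuses to start if <code>sssd.conf</code> is not owned by root or is group/world accessible, so it&apos;s worth double-checking the mode after the chmod. A tiny helper (hypothetical name, not part of sssd) for that check:</p>

```shell
# print a file's octal permission bits, e.g. "600"
file_mode() { stat -c '%a' "$1"; }

# after the chown/chmod above, this should print 600:
# file_mode /etc/sssd/sssd.conf
```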
<h6 id="jointhemachinetothedomain">Join the machine to the domain</h6>
<p><em><strong>You need a valid kerberos ticket for an Active Directory user with Domain Join privileges for this step</strong></em></p>
<pre><code>kinit domain_join_user@AD_REALM
net ads join -k
</code></pre>
<h6 id="ensurepamcreatesanewusershomedirectoryonsuccessfullogin">Ensure pam creates a new user&apos;s home directory on successful login</h6>
<p>vim <code>/etc/pam.d/common-session</code></p>
<pre><code># Add this line to the end
session optional                        pam_mkhomedir.so
</code></pre>
<h6 id="restartalltherelevantservices">Restart all the relevant services.</h6>
<pre><code>systemctl restart smbd.service nmbd.service
systemctl restart sssd.service
</code></pre>
<h6 id="testyourconfig">Test your config:</h6>
<pre><code>getent passwd ad_user@AD_REALM
sudo su - ad_user@AD_REALM
</code></pre>
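<p>If you want to script that check for a list of accounts, a small wrapper around <code>getent</code> (a hypothetical helper, not part of sssd) does the trick:</p>

```shell
# prints OK if NSS (and therefore sssd) can resolve the account, MISSING otherwise
check_user() {
    if getent passwd "$1" >/dev/null; then echo OK; else echo MISSING; fi
}

for u in ad_user@AD_REALM root; do
    printf '%s: %s\n' "$u" "$(check_user "$u")"
done
```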
<p>If that was successful, you&apos;re good to go! You should be able to SSH into this machine with your Active Directory credentials.</p>
<h6 id="troubleshooting">Troubleshooting:</h6>
<ul>
<li>
<p><strong>SSSD conf typo:</strong></p>
<p>If you&apos;ve been unlucky and had a typo in your sssd.conf, you may have to reboot your VM in recovery mode and delete the sssd.conf file before continuing with boot.</p>
</li>
</ul>
<ul>
<li>
<p><strong>Glitchy install:</strong></p>
<p>I&apos;ve had some machines where the install simply freezes and there&apos;s no way to successfully continue. In those cases, I recommend completely purging the installed components and starting over:</p>
</li>
</ul>
<pre><code>  apt remove --purge -y samba sssd chrony
  apt-get autoremove -y 
  apt-get purge -y samba samba-common
</code></pre>
<ul>
<li>
<p><strong>Debugging SSSD:</strong></p>
<p>Add the <code>debug_level = [1..9]</code> statement under each section in <code>sssd.conf</code> you want to debug.</p>
</li>
</ul>
<p>Here are some of the links that I&apos;ve used as a reference:</p>
<ul>
<li><a href="http://web.mit.edu/kerberos/krb5-devel/doc/admin/install_kdc.html">http://web.mit.edu/kerberos/krb5-devel/doc/admin/install_kdc.html</a></li>
<li><a href="https://github.com/HortonworksUniversity/Security_Labs/blob/master/HDP-2.6-AD.md#lab-4">https://github.com/HortonworksUniversity/Security_Labs/blob/master/HDP-2.6-AD.md#lab-4</a></li>
<li><a href="https://help.ubuntu.com/lts/serverguide/sssd-ad.html">https://help.ubuntu.com/lts/serverguide/sssd-ad.html</a></li>
</ul>
<p><em><strong>Feel free to reach out to me <a href="https://twitter.com/rohchak">@rohchak</a> if you have any questions! - chances are I&apos;ve worked my way through it :)</strong></em></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Deploying an AWS EMR cluster with on-prem/cross-cloud Active Directory Authentication]]></title><description><![CDATA[Deploy an AWS Elastic Map Reduce cluster with Kerberos based Active Directory Authentication]]></description><link>https://rohanc.me/aws-emr-cluster-ad-auth/</link><guid isPermaLink="false">62a05b46786471000132909d</guid><category><![CDATA[AWS]]></category><category><![CDATA[EMR]]></category><category><![CDATA[VPN]]></category><category><![CDATA[Active Directory]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Mon, 09 Jul 2018 20:48:30 GMT</pubDate><media:content url="https://rohanc.me/content/images/2022/06/hadoop-3.png" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://rohanc.me/content/images/2022/06/hadoop-3.png" alt="Deploying an AWS EMR cluster with on-prem/cross-cloud Active Directory Authentication"><p>If you&apos;re ever in the <em>enviable</em> position of having to get your AWS Elastic Map Reduce (EMR) cluster authenticating against an on-prem/cross-cloud Active Directory instance this post is for you!</p>
<p>Let&apos;s break this down into the separate pieces we&apos;re going to need:</p>
<ol>
<li>
<p><strong>A VPN/Direct-Connect connection</strong> to the on-prem/cross-cloud Active Directory network</p>
</li>
<li>
<p><strong>Kerberos Authentication</strong></p>
</li>
</ol>
<p>AWS actually has all of this pretty well documented, so I&apos;m not going to list individual steps. However, I&apos;ll list a couple of gotchas that ended up taking us a couple of days to work through.</p>
<p>First, the resources:</p>
<ol>
<li>
<p><strong>Setting up a VPN connection to your AD network</strong>: <a href="https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/SetUpVPNConnections.html">https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/SetUpVPNConnections.html</a><br>
Instead of deploying a Windows Server EC2 instance, use a machine on your internal network (or your router) and work through the steps.</p>
</li>
<li>
<p><strong>Deploying a Kerberized EMR cluster with a cross-realm AD trust</strong>: <a href="https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-kerberos-cross-realm.html">https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-kerberos-cross-realm.html</a></p>
</li>
</ol>
<h4 id="thingstolookoutfor">Things to look out for:</h4>
<h6 id="vpn">VPN</h6>
<p>You will need to initiate a connection from your external network to the AWS VPC to activate the VPN after you configure it. The easiest way to do this is to allow incoming ICMP packets to an existing EC2 instance in your VPC and ping it.</p>
<h6 id="crossrealmtrust">Cross-Realm Trust</h6>
<ul>
<li>When creating the DHCP option set that specifies your AD DC as a DNS server, there is a line that reads <code>xx.xx.xx.xx,AmazonProvidedDNS</code>. You literally have to enter the string <code>AmazonProvidedDNS</code> after the IP of your DC. I&apos;d never used a DHCP option set before, so that tripped me up for a bit.</li>
<li>Follow the casing exactly as specified in the post for the realms, domains and servers</li>
<li>You <strong>do not</strong> have to add individual users. The EMR deployment handles PAM and sssd configs. If your cluster has been set up correctly, you should be able to ssh into the cluster with <em>AD_username</em>@<em>yourdomain.com</em> and the user&apos;s AD password. The first time you login as a user the user&apos;s home directory is automatically created, and a kerberos ticket is requested.</li>
<li>You can absolutely configure the trust to be transitive. I&apos;m not sure why their documentation specifies a non-transitive one (this can be changed after the initial deployment, so it&apos;s not a big deal).</li>
</ul>
<h6 id="slowkerberosauthtickets">Slow Kerberos auth/tickets?</h6>
<ul>
<li>Kerberos tries UDP before TCP by default, and switching to TCP significantly sped things up for us. Add the following line to the <code>[libdefaults]</code> section of your krb5.conf to prioritize TCP:<br>
<code>udp_preference_limit = 1</code></li>
</ul>
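<p>For reference, the relevant section of <code>/etc/krb5.conf</code> ends up looking roughly like this (the realm is a placeholder; your other [libdefaults] settings stay as they are):</p>

```
[libdefaults]
    default_realm = AD_REALM
    # payloads larger than 1 byte skip UDP entirely, i.e. always use TCP
    udp_preference_limit = 1
```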
<p><em><strong>Feel free to reach out to me <a href="https://twitter.com/rohchak">@rohchak</a> if you have any questions!</strong></em></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Monitoring Kubernetes with Prometheus 2.1+, Grafana 5.1+ and Helm]]></title><description><![CDATA[Monitoring my Kubernetes cluster and pods with a couple of helm charts and zero manual config!]]></description><link>https://rohanc.me/monitoring-kubernetes-prometheus-grafana/</link><guid isPermaLink="false">62a05b46786471000132909c</guid><category><![CDATA[kubernetes]]></category><category><![CDATA[monitoring]]></category><category><![CDATA[prometheus]]></category><category><![CDATA[grafana]]></category><category><![CDATA[helm]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Tue, 26 Jun 2018 23:05:16 GMT</pubDate><media:content url="https://rohanc.me/content/images/2022/06/Grafana-Prometheus-1.png" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://rohanc.me/content/images/2022/06/Grafana-Prometheus-1.png" alt="Monitoring Kubernetes with Prometheus 2.1+, Grafana 5.1+ and Helm"><p>I recently deployed a new Kubernetes cluster and needed to get my usual Prometheus + Grafana monitoring set up. For my last few deployments I&apos;ve used a Helm chart that is at least 6 months old, so I thought I&apos;d go with the latest and greatest this time around - and I&apos;m so glad I did!</p>
<p>There have been some really nifty improvements. My favorite is being able to specify datasources and dashboards while installing the Grafana chart. In my case this meant I could install Prometheus and configure my Grafana dashboards to use it as a datasource during install - with absolutely zero manual configuration.</p>
<p>I was planning to create a single Helm chart that installs both these charts, but Helm doesn&apos;t currently surface the extremely helpful notes from the Prometheus and Grafana charts. There&apos;s an open <a href="https://github.com/kubernetes/helm/issues/2751">Helm Issue</a> with a PR, so hopefully that gets resolved soon! Until then:</p>
<h6 id="prereqs">Prereqs</h6>
<ul>
<li>
<p>If your cluster is RBAC enabled, make sure you&apos;ve created a service account for tiller and have it bound to an appropriate role. I&apos;m generally the only one using these smaller clusters, so I just take the lazy way out and bind it to the cluster-admin role. <a href="https://docs.bitnami.com/kubernetes/how-to/configure-rbac-in-your-kubernetes-cluster/">This Bitnami post</a> is a great resource if you want to limit what your tiller deployment can do.</p>
</li>
<li>
<p>If your cluster <em>is not</em> RBAC enabled, be sure to disable RBAC for both the Grafana and Prometheus charts.</p>
</li>
<li>
<p>Ensure you&apos;ve updated your helm repo. This threw me off for a bit (alright, 2 hours) because the values in stable weren&apos;t what I was seeing in the GitHub charts repo (that&apos;s what I get for not using helm for a couple of months :/)</p>
</li>
</ul>
<h6 id="quickinstall">Quick install:</h6>
<p>First Prometheus, so we have a working datasource:<br>
<code>helm install stable/prometheus --version 6.7.4 --name my-prometheus</code></p>
<p>Next, we&apos;re going to deploy Grafana with some dashboards configured to pull data from our Prometheus instance. I&apos;ve included both the official dashboard from Prometheus as well as one that provides cluster and pod-level information:<br>
<code>helm install --name my-grafana stable/grafana --version 1.11.6 -f values.yml</code></p>
<p>Here&apos;s the values.yml file I used:</p>
<script src="https://gist.github.com/rchakra3/430c36b3ed22873e2244530a66a63e4a.js"></script>
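<p>In case the embedded gist doesn&apos;t load in your reader, the key part of that file is the datasource provisioning block. This sketch assumes the <code>my-prometheus</code> release name from above, the <code>default</code> namespace, and the Prometheus chart&apos;s default <code>-server</code> service suffix:</p>

```yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        isDefault: true
        url: http://my-prometheus-server.default.svc.cluster.local
```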
<p>Follow the instructions from the Grafana chart notes and when you login you should see your dashboards already pulling data!</p>
<p><img src="https://github.com/rchakra3/static-assets/raw/master/grafana-prometheus/Grafana-Prometheus.png" alt="Monitoring Kubernetes with Prometheus 2.1+, Grafana 5.1+ and Helm" loading="lazy"></p>
<p><img src="https://github.com/rchakra3/static-assets/raw/master/grafana-prometheus/Prometheus_stats.png" alt="Monitoring Kubernetes with Prometheus 2.1+, Grafana 5.1+ and Helm" loading="lazy"></p>
<p><em><strong>Feel free to reach out to me <a href="https://twitter.com/rohchak">@rohchak</a> if you have any questions!</strong></em></p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Kubernetes for Everything! Part 2 - On demand Jenkins build agents]]></title><description><![CDATA[This is part 2 in a series, in which I explore spinning up on-demand build pods that run the builds, upload the artifacts and are then destroyed.]]></description><link>https://rohanc.me/kubernetes-on-demand-jenkins-build-agents/</link><guid isPermaLink="false">62a05b46786471000132909b</guid><category><![CDATA[kubernetes]]></category><category><![CDATA[azure]]></category><category><![CDATA[CI/CD]]></category><category><![CDATA[containers]]></category><category><![CDATA[windows-containers]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Mon, 31 Jul 2017 05:00:05 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>When I found out Kubernetes had support for Windows containers, I was pretty excited. I work with applications running on both Operating Systems so this opens up a lot of opportunities.</p>
<p>I plan to explore building a CI/CD pipeline that can scale based on load, set up monitoring (both cluster and application logs) and deploy both&#xA0;.NET apps in Windows containers and other apps in Linux containers&#x200A;&#x2014;&#x200A;all on Kubernetes.</p>
<p>This is part 2 in a series, in which I explore spinning up on-demand build pods that run the builds, publish the artifacts to Azure blob storage and are then destroyed. This has a couple of advantages:</p>
<ol>
<li>
<p>A clean build environment:<br>
We own a lot of .NET projects, some of which have been around for a while and use different versions of the framework. That can mean our build machines have multiple versions of NuGet, MSBuild and .NET, which has tripped up our builds more than once. This setup allows us to define multiple Docker images, each with its own version of the framework and associated tools. As you&apos;ll see, the base image stays the same - the only difference is the version of .NET we install.</p>
</li>
<li>
<p>Less resource wastage:<br>
There is also the added advantage of not having Jenkins build agents just idling away, using cluster resources when there are no builds.</p>
</li>
</ol>
<h2 id="part2jenkinswithondemandagents">Part 2: Jenkins with on-demand agents</h2>
<p>This post assumes you have a Jenkins master pod deployed on your cluster already. If not, <a href="https://rohanc.me/kubernetes-for-everything-with-windows-and-linux-containers-on-azure-part-1">Part 1</a> goes through that initial setup. Let&apos;s get started!</p>
<h3 id="settingupthekubernetesplugin">Setting up the Kubernetes plugin</h3>
<p>Before you set up builds, you&apos;ll need to configure the plugin so it can talk to your cluster.</p>
<ul>
<li>Install <a href="https://wiki.jenkins.io/display/JENKINS/Kubernetes+Plugin">the plugin</a></li>
<li>Configure the plugin in global settings. The important fields here are:
<ul>
<li>Jenkins URL: the internal kubernetes service URL assigned to your Jenkins master service</li>
<li>Container Cleanup Timeout: the amount of time after which the plugin destroys a build pod. This is particularly important for larger Windows Server images, since they can take a while to pull, initialize and run the build. For me, 15 minutes worked well even for some of our larger projects, but this is something you can tune.</li>
</ul>
</li>
</ul>
<h3 id="jenkinsfileforubuntubasedbuilds">Jenkinsfile for Ubuntu based builds</h3>
<p>Once you&apos;ve set up the plugin and configured a multistage pipeline for a repository, the plugin allows for some really cool use cases. For example, it lets you define multiple build containers in a single build pod, perform container-specific actions within those containers, and pass the output to another container in the same pod. Under the hood, it achieves this by using shared volumes.</p>
<p>So if you decide you want to build a docker image from your latest commit and then deploy your code on a Kubernetes cluster, only if your branch is <code>master</code>, this is a valid Jenkinsfile:</p>
<script src="https://gist.github.com/rchakra3/8192c2cb58c019f2c4b121d6eb232b51.js"></script>
<p>Notice how we can run certain commands in the context of specific containers. Another important point is that the plugin uses the <a href="https://hub.docker.com/r/jenkinsci/jnlp-slave/">default jnlp image</a> if you don&apos;t specify a <code>containerTemplate</code> with its name set to <code>jnlp</code>. This becomes important when we move to Windows builds.</p>
<h3 id="baseimageforwindowsbasedbuilds">Base image for Windows based builds</h3>
<p>Kubernetes currently only supports one Windows container per pod. Unfortunately, that means we can&apos;t take advantage of specialized containers within the Jenkinsfile like we did with the Ubuntu builds. Instead, I built a base windowsservercore image and then added specific packages to specialize it. I used the <a href="https://gist.github.com/rchakra3/ac55b33f01020b0a129460d1422ac940">windows image here</a> from my previous post as the base, but with chocolatey and git installed. Chocolatey is a package manager for Windows that allows us to run headless package installations. Add these lines to your Dockerfile to install chocolatey:</p>
<pre><code># Install git through chocolatey and add git to the path
ENV chocolateyUseWindowsCompression false
RUN iex ((new-object net.webclient).DownloadString(&apos;https://chocolatey.org/install.ps1&apos;)); \
    choco install -v -y git
</code></pre>
<p>Using chocolatey, we can install almost any package that we&apos;d need for builds. Here&apos;s a snippet for .NET 4.5.2:</p>
<pre><code># -y skips the confirmation prompt, which would otherwise hang a docker build
RUN choco install -y netfx-4.5.2-devpack
</code></pre>
<p>MSBuild for VS2017 also has a standalone package that comes without the entire VS2017 package:</p>
<pre><code># Install msbuild (vs2017) and add to PATH
RUN Invoke-WebRequest &quot;https://aka.ms/vs/15/release/vs_BuildTools.exe&quot; -OutFile vs_BuildTools.exe -UseBasicParsing ; \
        Start-Process -FilePath &apos;vs_BuildTools.exe&apos; -ArgumentList &apos;--quiet&apos;, &apos;--norestart&apos;, &apos;--locale en-US&apos; -Wait ; \
        Remove-Item .\vs_BuildTools.exe ; \
        Remove-Item -Force -Recurse &apos;C:\Program Files (x86)\Microsoft Visual Studio\Installer&apos;
RUN setx /M PATH $($Env:PATH + &apos;;&apos; + ${Env:ProgramFiles(x86)} + &apos;\Microsoft Visual Studio\2017\BuildTools\MSBuild\15.0\Bin&apos;)
</code></pre>
<p>You can see that once we have that base image set up, everything else is as simple as adding a couple of extra packages for different build environments.</p>
<p><em><strong>Note: There seems to be a bug of some kind while mapping volumes in Kubernetes with Windows. If you set C:\Jenkins as your build folder, you&apos;ll see an error along the lines of \ContainerVolumes .. is not valid. The workaround is to mount the folder as a separate drive, and use it for your builds:</strong></em></p>
<pre><code># For some reason just using C:\Jenkins does not work - it tries to map to \ContainerVolumes in k8s. The workaround is to mount the folder as a drive and use it as the working directory for builds
RUN set-itemproperty -path &apos;HKLM:\SYSTEM\CurrentControlSet\Control\Session Manager\DOS Devices&apos; -Name &apos;G:&apos; -Value &apos;\??\C:\Jenkins&apos; -Type String
</code></pre>
<h3 id="jenkinsfileforwindowsbasedbuilds">Jenkinsfile for Windows based builds</h3>
<p>Here&apos;s an example of a Jenkinsfile that I&apos;ve used to build one of our .NET projects:</p>
<script src="https://gist.github.com/rchakra3/d940a7425578c64524332c25096a5c46.js"></script>
<p>After a successful build, it uploads the build artifact to Azure blob storage using a <a href="https://github.com/rchakra3/blogs/tree/master/k8s-windows-linux/code">small script I wrote</a>. Run the script with the <code>--help</code> flag for all the options.</p>
<h3 id="finalthoughts">Final thoughts</h3>
<p>Having never worked with headless installations in Windows before, discovering and using <a href="https://chocolatey.org">Chocolatey</a> was amazingly helpful. Although the packages come with no guarantees for production environments, I&apos;ve not had a problem with any of them so far. Kicking off builds requiring a version of the .NET framework not on our build machine was a tedious process, and this setup definitely makes that process much easier.</p>
<p>There are a lot of good examples of what you can do with the Jenkins Kubernetes plugin on their <a href="https://github.com/jenkinsci/kubernetes-plugin">Github page</a>. They&apos;re written specifically with respect to the Ubuntu jnlp image though.</p>
<p>There was a <a href="https://github.com/Azure/acs-engine/issues/959">brief bug</a> in the ACS-engine deployment of Kubernetes 1.6.6 which resulted in our windows containers not having any internet connectivity. That was frustrating, but very quickly fixed. 1.7 now has added support for managed disks on Azure, which should be interesting to play around with as well!</p>
<p><em>Next up, Monitoring!</em></p>
<p><em><strong>Feel free to reach out to me <a href="https://twitter.com/rohchak">@rohchak</a> if you have any questions!</strong></em></p>
<p>[Update 2018/06/26: <a href="https://rohanc.me/monitoring-kubernetes-prometheus-grafana/">Monitoring</a>, the post I was supposed to write a year ago]</p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Kubernetes for Everything! (With Windows and Linux on Azure) Part 1 - Jenkins]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>When I found out Kubernetes had support for Windows containers, I was pretty excited. I work with applications running on both Operating Systems so this opens up a lot of opportunities.</p>
<p>I plan to explore building a CI/CD pipeline that can scale based on load, set up monitoring (both</p>]]></description><link>https://rohanc.me/kubernetes-for-everything-with-windows-and-linux-containers-on-azure-part-1/</link><guid isPermaLink="false">62a05b46786471000132909a</guid><category><![CDATA[azure]]></category><category><![CDATA[kubernetes]]></category><category><![CDATA[CI/CD]]></category><dc:creator><![CDATA[Rohan Chakravarthy]]></dc:creator><pubDate>Wed, 19 Jul 2017 09:36:10 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>When I found out Kubernetes had support for Windows containers, I was pretty excited. I work with applications running on both Operating Systems so this opens up a lot of opportunities.</p>
<p>I plan to explore building a CI/CD pipeline that can scale based on load, set up monitoring (both cluster and application logs) and deploy both&#xA0;.NET apps in Windows containers and other apps in Linux containers&#x200A;&#x2014;&#x200A;all on Kubernetes.</p>
<p>I hope to share what I&apos;ve learnt through these posts, starting with employing our favorite butler!</p>
<h2 id="part1jenkinsonahybridwindowslinuxkubernetescluster">Part 1: Jenkins on a hybrid Windows/Linux Kubernetes cluster</h2>
<p>In this post, I&apos;ll explain how to get a traditional Jenkins cluster with one Ubuntu and one Windows agent working. In the <a href="https://rohanc.me/kubernetes-on-demand-jenkins-build-agents/">next one</a>, I&apos;ll talk about on-demand dynamic agents that only spin up for a build, save the artifact and are then shut down - clean build environments and no wasted resources!</p>
<h3 id="deployingtheclusteronazure">Deploying the cluster on Azure</h3>
<p>Deploying a hybrid Windows/Linux cluster isn&apos;t supported directly through the Azure Container Service (ACS) command-line tools or the portal, so we need to generate a custom ARM template. The open source acs-engine codebase makes that really easy.</p>
<p>Just follow the instructions <a href="https://github.com/Azure/acs-engine/blob/master/docs/acsengine.md#development-in-docker">here</a> to run and build acs-engine inside a container and then <a href="https://github.com/Azure/acs-engine/blob/master/docs/acsengine.md#generating-a-template">generate the Kubernetes ARM template</a> in the <code>_output</code> folder.<br>
For reference, <a href="https://gist.github.com/rchakra3/8e080ebbcb0f11f429efe8853befb6aa">this is what my kubernetes.json file looks like</a></p>
<p><strong>Note:</strong> if/when you want to update the cluster (modify an existing agent pool, add a new pool, etc.), use the generated <code>apimodel.json</code> file instead of <code>kubernetes.json</code> so you keep all the same cert info, etc.</p>
<h3 id="deploythejenkinsmasterpodonalinuxnode">Deploy the Jenkins master pod on a Linux node</h3>
<p>I&apos;ve used the <a href="https://hub.docker.com/_/jenkins/">official Jenkins Dockerhub image</a> for master.</p>
<p>All we need to do here is create the deployment and service in Kubernetes and optionally add a persistent volume. I also like to create storage classes to differentiate between using SSDs and HDDs.</p>
<p>(Run <code>kubectl apply -f [filename]</code> for all the YAML files)</p>
<h4 id="createthestorageclassesoptional">Create the storage classes (optional)</h4>
<script src="https://gist.github.com/rchakra3/a1816d23abf6791d123d2272e84f958b.js"></script>
<h4 id="createthepersistentstoragevolume">Create the persistent storage volume</h4>
<script src="https://gist.github.com/rchakra3/9406469cfacc994a2ef02417a171de2b.js"></script>
<h4 id="createtheserviceanddeployment">Create the service and deployment</h4>
<script src="https://gist.github.com/rchakra3/7e695cb6175b724ae3d0d4b9404cdb71.js"></script>
<script src="https://gist.github.com/rchakra3/7c09111161c0d647b91c0cceaf10c35e.js"></script>
<p>Note the <code>nodeSelector</code> block.</p>
<h3 id="configurejenkinsmaster">Configure Jenkins master</h3>
<ul>
<li>
<p>Figure out the name of the pod running Jenkins:<br>
<code>kubectl get pods</code></p>
</li>
<li>
<p>Get the password:</p>
</li>
</ul>
<p><code>kubectl exec [POD_NAME] cat /var/jenkins_home/secrets/initialAdminPassword</code></p>
<ul>
<li>
<p>Navigate to the IP reserved for the Jenkins-master service and enter the password.</p>
</li>
<li>
<p>Click through the rest of the setup and you&apos;re done!</p>
</li>
</ul>
<h3 id="buildthelinuxagent">Build the Linux agent</h3>
<p>Add a new Jenkins agent through the UI</p>
<div class="image-div">
<img src="https://cdn.rawgit.com/rchakra3/static-assets/5a2d8018/jenkins-kubernetes/permanent-agents/Jenkins-ubuntu-agent.png" alt="New Agent" loading="lazy">
</div>
<p>Once you create the agent you&apos;ll see a screen with details on setting up an agent. We&apos;re only interested in the secret.</p>
<p>Pass the secret in as an argument to the docker build:</p>
<script src="https://gist.github.com/rchakra3/caf270ab6ae36b43821eb224e0062898.js"></script>
<br>
<h3 id="buildthewindowsagent">Build the Windows agent</h3>
<p>Use the same steps to set up a new Jenkins agent and get a new secret to pass to the windows container build:</p>
<script src="https://gist.github.com/rchakra3/ac55b33f01020b0a129460d1422ac940.js"></script>
<p><em>This will just get your cluster running. <a href="https://gist.github.com/rchakra3/30f0db04f31f381309cfc436044ba5fb">Here&apos;s a link</a> to one that installs some basic .NET build tools.</em></p>
<h3 id="createtheserviceanddeploymentforboth">Create the service and deployment for both</h3>
<p>Again, note the <code>nodeSelector</code> to ensure it gets scheduled on the right nodes (based on OS)</p>
<p>Ubuntu:</p>
<script src="https://gist.github.com/rchakra3/4bd7128669199e8d940efc720e9bc560.js"></script>
<p>Windows:</p>
<script src="https://gist.github.com/rchakra3/346418455a6099b77751fb916c55e018.js"></script>
<br>
<h3 id="finalthoughts">Final thoughts</h3>
<p>This initial setup was really not too difficult - the only issue I hit was that the Windows agent connection kept timing out when I set <code>JENKINS_JNLP_URL</code> to the public IP. As soon as I set it to the internal Kubernetes service IP, things started running smoothly.</p>
<p>I&apos;m excited to see how everything else works out!</p>
<p><em>Next up - dynamic, on-demand Windows/Linux agents using the Jenkins Kubernetes plugin!</em></p>
<p><em><strong>Feel free to reach out to me <a href="https://twitter.com/rohchak">@rohchak</a> if you have any questions!</strong></em></p>
<p>Update: <a href="https://rohanc.me/kubernetes-on-demand-jenkins-build-agents/">Here&apos;s a link to Part 2</a>!</p>
<!--kg-card-end: markdown-->]]></content:encoded></item></channel></rss>