A demo of running the Supervisor and Concierge on
a kind cluster. Can be used to quickly set up an
environment for manual testing.
Also added some missing copyright headers to other
hack scripts.
These commits include security fixes (CVE-2021-3121) for code generated by github.com/gogo/protobuf.
We expect this fix to also land in v1.20.6, but we don't want to wait for it.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This test could flake if the load balancer hostname had been provisioned but was not yet resolving in DNS from the test process.
The fix is to retry this step for up to 5 minutes.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This test could fail when the cluster was under heavy load, which could cause kubectl to emit "Throttling request took [...]" log messages that triggered a failure in the test.
The fix is to ignore these innocuous warnings.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We had this code that printed out pod logs when certain tests failed, but it is a bit cumbersome. We're removing it because we added a CI task that exports all pod logs after every CI run, which accomplishes the same thing and provides us a bunch more data.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This allows setting `$PINNIPED_TEST_CLI` to point at an existing `pinniped` CLI binary instead of having the test build one on-the-fly. This is more efficient when you're running the tests across many clusters as we do in CI.
Building the CLI from scratch in our CI environment takes 1.5-2 minutes, so this change should save nearly that much time on every test job.
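A rough sketch of the intended fallback, with a hypothetical buildPinnipedCLI helper standing in for the existing on-the-fly build:
```
// Sketch only: prefer a prebuilt binary from $PINNIPED_TEST_CLI, otherwise
// fall back to building the CLI from source as before.
func pinnipedCLIPath(t *testing.T) string {
	t.Helper()
	if path := os.Getenv("PINNIPED_TEST_CLI"); path != "" {
		return path // e.g. set by the CI pipeline to a shared prebuilt binary
	}
	return buildPinnipedCLI(t) // hypothetical helper for the slow build path
}
```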
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We've seen some test flakes caused by this test. Some small changes:
- Use a 30s timeout for each iteration of the test loop (so each iteration must pass or fail more quickly).
- Log a bit more during the checks so we can diagnose what's going on.
- Increase the overall timeout from one minute to five minutes.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- a credential that is understood by -> a credential that can be used to
authenticate to
- This is more neutral as to whether it's going directly to k8s
or through the impersonation proxy
Instead of using the LongRunningFunc to determine if we can safely
use http2, follow the same logic as the aggregation proxy and only
use http2 when the request is not an upgrade.
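A minimal sketch of that rule, assuming the upgrade check from k8s.io/apimachinery (the production wiring differs):
```
import (
	"net/http"

	"k8s.io/apimachinery/pkg/util/httpstream"
)

// chooseTransport uses http/1.1 for upgrade requests (exec, attach,
// port-forward), since http2 cannot carry connection upgrades, and an
// http2-capable transport for everything else.
func chooseTransport(req *http.Request, h1, h2 http.RoundTripper) http.RoundTripper {
	if httpstream.IsUpgradeRequest(req) {
		return h1
	}
	return h2
}
```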
Signed-off-by: Monis Khan <mok@vmware.com>
In the case where we are using middleware (e.g., when the api group is
different) in our kubeclient, these error messages have a "...middleware request
for..." bit in the middle.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
When the frontend connection to our proxy is closed, the proxy falls through to
a panic(), which means the HTTP handler goroutine is killed, so we were not
seeing this log statement.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This is probably a good idea regardless, but it also avoids an infinite recursion from IntegrationEnv() -> assertNoRestartsDuringTest() -> NewKubeclient() -> IntegrationEnv() -> ...
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This test could flake in some rare scenarios. This change adds a bunch of retries, improves the debugging output if the tests fail, and puts all of the subtests in parallel which saves ~10s on my local machine.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This test has occasionally flaked because it only waited for the APIService GET to finish, but did not wait for the controller to successfully update the target object.
The new code should be more patient and allow the controller up to 10s to perform the expected action.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This new capability describes whether a cluster is expected to allow anonymous requests (most do since k8s 1.6.x, but AKS has it disabled).
This commit also contains new capability YAML files for AKS and EKS, mostly to document publicly how we expect our tests to function in those environments.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
It takes a lot of manual steps to get ready to manually test the
impersonation proxy on a kind cluster, which makes it error prone,
so encapsulate them into a script to make it easier.
At the end of the test, wait for the KubeClusterSigningCertificate
strategy on the CredentialIssuer to go back to being healthy, to avoid
polluting other integration tests which follow this one.
This is the configuration for https://pre-commit.com/, which now also runs golangci-lint using the same version as CI (currently v1.33.0).
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We were previously issuing both client certs and server certs with
both extended key usages included. Split the Issue*() methods into
separate methods for issuing server certs versus client certs so
they can have different extended key usages tailored for each use
case.
Also took the opportunity to clean up the parameters of the Issue*()
methods and New() methods to more closely match how we prefer to call
them. We were always only passing the common name part of the
pkix.Name to New(), so now the New() method just takes the common name
as a string. When making a server cert, we don't need to set the
deprecated common name field, so remove that param. When making a client
cert, we're always making it in the format expected by the Kube API
server, so just accept the username and group as parameters directly.
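A condensed sketch of the split (template fields only; the real methods also handle serial numbers, key generation, and signing):
```
// Server certs get only the ServerAuth EKU plus SAN entries; no CommonName.
func serverCertTemplate(dnsNames []string, ips []net.IP, ttl time.Duration) *x509.Certificate {
	return &x509.Certificate{
		DNSNames:    dnsNames,
		IPAddresses: ips,
		ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		NotAfter:    time.Now().Add(ttl),
	}
}

// Client certs get only the ClientAuth EKU; the Kube API server reads the
// username from CN and the groups from O.
func clientCertTemplate(username string, groups []string, ttl time.Duration) *x509.Certificate {
	return &x509.Certificate{
		Subject:     pkix.Name{CommonName: username, Organization: groups},
		ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
		NotAfter:    time.Now().Add(ttl),
	}
}
```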
I'm kinda surprised this is working with our current implementation of the
impersonator, but regardless this seems like a step forward.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
The impersonator_test.go unit test now starts the impersonation
server and makes real HTTP requests against it using client-go.
It is backed by a fake Kube API server.
The CA IssuePEM() method was missing the argument to allow a slice
of IP addresses to be passed in.
These tests occasionally flake because of a conflict error such as:
```
supervisor_discovery_test.go:105:
Error Trace: supervisor_discovery_test.go:587
supervisor_discovery_test.go:105
Error: Received unexpected error:
Operation cannot be fulfilled on federationdomains.config.supervisor.pinniped.dev "test-oidc-provider-lvjfw": the object has been modified; please apply your changes to the latest version and try again
Test: TestSupervisorOIDCDiscovery
```
These retries should improve the reliability of the tests.
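The retries use the standard conflict-retry pattern from k8s.io/client-go/util/retry, roughly like this sketch (the generated Pinniped clientset calls are assumptions):
```
err := retry.RetryOnConflict(retry.DefaultRetry, func() error {
	// Re-fetch on every attempt so the update is based on the latest
	// resourceVersion instead of a stale copy.
	fd, err := client.ConfigV1alpha1().FederationDomains(ns).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	fd.Spec.Issuer = issuer
	_, err = client.ConfigV1alpha1().FederationDomains(ns).Update(ctx, fd, metav1.UpdateOptions{})
	return err
})
require.NoError(t, err)
```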
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Also make each t.Run use its own namespace to slightly reduce the
interdependency between them.
Use t.Cleanup instead of defer in whoami_test.go just to be consistent
with other integration tests.
The same coverage that was supplied by
TestCredentialRequest_OtherwiseValidRequestWithRealTokenShouldFailWhenTheClusterIsNotCapable
is now provided by an assertion at the end of TestImpersonationProxy,
so delete the duplicate test which was failing on GKE because the
impersonation proxy is now active by default on GKE.
When testing that the impersonation proxy port was closed there
is no need to include credentials in the request. At the point when
we want to test that the impersonation proxy port is closed, it is
possible that we cannot perform a TokenCredentialRequest to get a
credential either.
Also add a new assertion that the TokenCredentialRequest stops handing
out credentials on clusters which have no successful strategies.
Signed-off-by: Monis Khan <mok@vmware.com>
To make an impersonation request, first make a TokenCredentialRequest
to get a certificate. That cert will either be issued by the Kube
API server's CA or by a new CA specific to the impersonator. Either
way, you can then make a request to the impersonator and present
that client cert for auth and the impersonator will accept it and
make the impersonation call on your behalf.
The impersonator http handler now borrows some Kube library code
to handle request processing. This will allow us to more closely
mimic the behavior of a real API server, e.g. the client cert
auth will work exactly like the real API server.
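From the client's point of view the flow looks roughly like this sketch, where the cert and key bytes came back from the TokenCredentialRequest:
```
// Build a client-go clientset that talks to the impersonation proxy and
// authenticates with the client cert issued via TokenCredentialRequest.
func impersonationProxyClient(proxyURL string, caPEM, certPEM, keyPEM []byte) (kubernetes.Interface, error) {
	return kubernetes.NewForConfig(&rest.Config{
		Host: proxyURL,
		TLSClientConfig: rest.TLSClientConfig{
			CAData:   caPEM,   // CA bundle for the impersonator's serving cert
			CertData: certPEM, // issued by the Kube CA or the impersonator's CA
			KeyData:  keyPEM,
		},
	})
}
```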
Signed-off-by: Monis Khan <mok@vmware.com>
It also works on the slightly older macOS Catalina.
This script is only used on development laptops, so hopefully
it will now work on more laptop OSes.
Signed-off-by: Ryan Richard <richardry@vmware.com>
This adds two new flags to "pinniped get kubeconfig": --skip-validation and --timeout.
By default, at the end of the kubeconfig generation process, we validate that we can reach the configured cluster. In the future this might also validate that the TokenCredentialRequest API is running, but for now it just verifies that the DNS name resolves and that a TLS connection can be established on the given port.
If there is an error during this check, we block and retry for up to 10 minutes. This duration can be changed with --timeout, and the entire process can be skipped with --skip-validation.
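A rough sketch of the validation loop (assumed shape, not the exact CLI code):
```
// Probe the cluster endpoint until a TCP+TLS connection succeeds or the
// context (default 10 minutes, settable via --timeout) expires.
func waitForClusterReachable(ctx context.Context, hostPort string) error {
	for {
		// InsecureSkipVerify because we only probe reachability here, not identity.
		conn, err := (&tls.Dialer{Config: &tls.Config{InsecureSkipVerify: true}}).DialContext(ctx, "tcp", hostPort)
		if err == nil {
			return conn.Close() // DNS resolved and TLS is answering
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("timed out waiting for cluster: %w", err)
		case <-time.After(5 * time.Second):
		}
	}
}
```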
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This makes the output easier to copy-paste into the test. We could also make it ignore the order of key/value pairs in the future.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
The thing we're waiting for is mostly that DNS is resolving, the ELB is listening, and connections are making it to the proxy.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
All controller unit tests were accidentally using a timeout context
for the informers, instead of a cancel context which stays alive until
each test is completely finished. There is no reason to risk
unpredictable behavior of a timeout being reached during an individual
test, even though with the previous 3 second timeout it could only be
reached on a machine which is running orders of magnitude slower than
usual, since each test usually runs in about 100-300 ms. Unfortunately,
sometimes our CI workers might get that slow.
This sparked a review of other usages of timeout contexts in other
tests, and all of them were increased to a minimum value of 1 minute,
under the rule of thumb that our tests will be more reliable on slow
machines if they "pass fast and fail slow".
In impersonator_config_test.go, instead of waiting for the resource
version to appear in the informers, wait for the actual object to
appear.
This is an attempt to resolve flaky failures that only happen in CI,
but it also cleans up the test a bit by avoiding inventing fake resource
version numbers all over the test.
Signed-off-by: Monis Khan <mok@vmware.com>
- Use `Eventually` when making tls connections because the production
code's handling of starting and stopping the TLS server port
has some async behavior.
- Don't use resource version "0" because that has special meaning
in the informer libraries.
This time, don't use the Squid proxy if the cluster supports real external load balancers (as in EKS/GKE/AKS).
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Because otherwise `go test` will panic/crash your test if it takes
longer than 10 minutes, which is an annoying way for an integration
test to fail since it skips all of the t.Cleanup's.
This updates our issuerconfig.UpdateStrategy to sort strategies according to a weighted preference.
The TokenCredentialRequest API strategy is preferred, followed by the impersonation proxy, followed by any other unknown types.
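The weighting boils down to something like this sketch (constant names approximate the CredentialIssuer API and may not match exactly):
```
func strategyWeight(t v1alpha1.StrategyType) int {
	switch t {
	case v1alpha1.KubeClusterSigningCertificateStrategyType: // TokenCredentialRequest API
		return 0
	case v1alpha1.ImpersonationProxyStrategyType:
		return 1
	default:
		return 2 // unknown strategy types sort last
	}
}

func sortStrategies(strategies []v1alpha1.CredentialIssuerStrategy) {
	sort.SliceStable(strategies, func(i, j int) bool {
		return strategyWeight(strategies[i].Type) < strategyWeight(strategies[j].Type)
	})
}
```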
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- This commit does not include the updates that we plan to make to
the `status.strategies[].frontend` field of the CredentialIssuer.
That will come in a future commit.
This is more than an automatic merge. It also includes a rewrite of the CredentialIssuer API impersonation proxy fields using the new structure, and updates to the CLI to account for that new API.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
I think this is another aspect of the test flakes we're trying to fix. This matters especially for the "Multiple Pinnipeds" test environment where two copies of the test suite are running concurrently.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
If the test is run immediately after the Concierge is installed, the API server can still have broken discovery data and return an error on the first call.
This commit adds a retry loop to attempt this first kubectl command for up to 60s before declaring failure.
The subsequent tests should be covered by this as well since they are not run in parallel.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
These controllers were a bit inconsistent. There were cases where the controllers ran out of the expected order and the custom labels might not have been applied.
We should still plan to remove this label handling or move responsibility into the middleware layer, but this avoids any regression.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This field is a new tagged-union style field that describes how clients can connect using each successful strategy.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We don't support using the impersonate headers through the impersonation
proxy yet, so this integration test is a negative test which asserts
that we get an error.
- The CA cert will end up in the end user's kubeconfig on their client
machine, so if it changes they would need to fetch the new one and
update their kubeconfig. Therefore, we should avoid changing it as
much as possible.
- Now the controller writes the CA to a different Secret. It writes both
the cert and the key so it can reuse them to create more TLS
certificates in the future.
- For now, it only needs to make more TLS certificates if the old
TLS cert Secret gets deleted or updated to be invalid. This allows
for manual rotation of the TLS certs by simply deleting the Secret.
In the future, we may want to implement some kind of auto rotation.
- For now, rotation of both the CA and TLS certs will also happen if
you manually delete the CA Secret. However, this would cause the end
users to immediately need to get the new CA into their kubeconfig,
so this is not as elegant as a normal rotation flow where you would
have a window of time where you have more than one CA.
Also fixes our sitemap to have correct `lastmod` times when built locally (it was already correct on Netlify).
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Should work on clusters which have:
- load balancers not supported, has squid proxy (e.g. kind)
- load balancers supported, has squid proxy (e.g. EKS)
- load balancers supported, no squid proxy (e.g. GKE)
When testing with a load balancer, call the impersonation proxy through
the load balancer.
Also, added a new library.RequireNeverWithoutError() helper.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
This flag selects a CredentialIssuer to use when detecting what mode the Concierge is in on a cluster. If not specified, the command will look for a single CredentialIssuer. If there are multiple, then the flag is required.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Also update concierge_impersonation_proxy_test.go integration test
to use real TLS when calling the impersonator.
Signed-off-by: Ryan Richard <richardry@vmware.com>
These are prone to breaking when stdr is upgraded because they rely on the exact ordering of keys in the log message. If we have more problems we can rewrite the assertions to be more robust, but for now I'm just fixing them to match the new output.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
The login commands now expect either `--concierge-mode ImpersonationProxy` or `--concierge-mode TokenCredentialRequestAPI` (the default).
This is partly a style choice, but I also think it helps in case we need to add a third major mode of operation at some point.
I also cleaned up some other minor style items in the help text.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Adds a new optional `spec.impersonationProxyInfo` field to hold the URL and CA data for the impersonation proxy, as well as some additional status condition constants for describing the current status of the impersonation proxy.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
These are some more changes that came up when Pablo and I were reviewing the previous docs PR.
In no particular order:
- Fix "related posts" on the blog section, and hide the section if there are none.
- Minor style changes to several pages (guided by various style guides).
- Redirect the root of get.pinniped.dev to our main page (shouldn't really be hit, but it's nice to do something).
- Add more mobile-friendly CSS for our docs.
- Reword the "getting started" CTA, and hide it on the docs pages (you're already there).
- Fix the "Learn how Pinniped provides identity services to Kubernetes" link on the landing page.
- Add a date to our blog post cards.
- Rewrite the hero text on the landing page.
- Fix the docs link for the "Get Started with Pinniped" button on the landing page.
- Rework the landing page grid text.
- Add Margo and Nanci to the team section and sort it alphabetically.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Also:
- Shut down the informer correctly in
concierge_impersonation_proxy_test.go
- Remove the t.Failed() checks which avoid cleaning up after failed
tests. This was inconsistent with how most of the tests work, and
left cruft on clusters when a test failed.
Signed-off-by: Ryan Richard <richardry@vmware.com>
Also:
- Changed base64 encoding of impersonator bearer tokens to use
`base64.StdEncoding` to make it easier for users to manually
create a token using the unix `base64` command
- Test the headers which are and are not passed through to the Kube API
by the impersonator more carefully in the unit tests
- More WIP on concierge_impersonation_proxy_test.go
Signed-off-by: Margo Crawford <margaretc@vmware.com>
This change adds a new virtual aggregated API that can be used by
any user to echo back who they are currently authenticated as. This
has general utility to end users and can be used in tests to
validate if authentication was successful.
Signed-off-by: Monis Khan <mok@vmware.com>
- Because the impersonation proxy config controller needs to be able
to delete the load balancer which it created
Signed-off-by: Margo Crawford <margaretc@vmware.com>
This is a more reliable way to determine whether the load balancer
is already running.
Also added more unit tests for the load balancer.
Signed-off-by: Ryan Richard <richardry@vmware.com>
Makes most of the fonts a bit bigger, increases contrast, fixes some nits about the spacing in numbered/bulleted lists, and adds some image alt texts.
Overall this improves our Lighthouse accessibility score from 71 to 95 and I think it's subjectively more readable.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This wasn't needed before because the other code wasn't in the main module and golangci-lint won't cross a module boundary.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
If someone has already set impersonation headers in their request, then
we should fail loudly so the client knows that its existing impersonation
headers will not work.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I think we were assuming the name of our Concierge app, and getting lucky
because it was the name we use when testing locally (but not in CI).
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Watch a configmap to read the configuration of the impersonation
proxy and reconcile it.
- Implements "auto" mode by querying the API for control plane nodes.
- WIP: does not create a load balancer or proper TLS certificates yet.
Those will come in future commits.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
This is a partial revert of 288d9c999e. For some reason it didn't occur to me
that we could do it this way earlier. Whoops.
This also contains a middleware update: mutation funcs can return an error now
and short-circuit the rest of the request/response flow. The idea here is that
if someone is configuring their kubeclient to use middleware, they are agreeing
to a narrower client contract by doing so (e.g., their TokenCredentialRequests
must have a Spec.Authenticator.APIGroup set).
I also updated some internal/groupsuffix tests to be more realistic.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I think the reason we were seeing flakes here is because the kube cert agent
pods had not reached a steady state even though our test assertions passed, so
the test would proceed immediately and run more assertions on top of a weird
state of the kube cert agent pods.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I added that test helper to create an http.Request since I wanted to properly
initialize the http.Request's context.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This allows us to keep all of our resources in the pinniped category
while not having kubectl return errors for calls such as:
kubectl get pinniped -A
Signed-off-by: Monis Khan <mok@vmware.com>
As of upgrading to Kubernetes 1.20, our aggregated API server now runs some
controllers for the two flowcontrol.apiserver.k8s.io resources in the title of
this commit, so it needs RBAC to read them.
This should get rid of the following error messages in our Concierge logs:
Failed to watch *v1beta1.FlowSchema: failed to list *v1beta1.FlowSchema: flowschemas.flowcontrol.apiserver.k8s.io is forbidden: User "system:serviceaccount:concierge:concierge" cannot list resource "flowschemas" in API group "flowcontrol.apiserver.k8s.io" at the cluster scope
Failed to watch *v1beta1.PriorityLevelConfiguration: failed to list *v1beta1.PriorityLevelConfiguration: prioritylevelconfigurations.flowcontrol.apiserver.k8s.io is forbidden: User "system:serviceaccount:concierge:concierge" cannot list resource "prioritylevelconfigurations" in API group "flowcontrol.apiserver.k8s.io" at the cluster scope
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I messed this up before because the ordering of the path components is a bit different than in the specific version case.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
When the Pinniped server has been installed with the `api_group_suffix`
option, for example using `mysuffix.com`, then clients who would like to
submit a `TokenCredentialRequest` to the server should set the
`Spec.Authenticator.APIGroup` field as `authentication.concierge.mysuffix.com`.
This makes more sense from the client's point of view than using the
default `authentication.concierge.pinniped.dev`, because
`authentication.concierge.mysuffix.com` is the name of the API group
that they can observe on their cluster, while `authentication.concierge.pinniped.dev`
does not exist as an API group on their cluster.
This commit includes both the client and server-side changes to make
this work, as well as integration test updates.
Co-authored-by: Andrew Keesler <akeesler@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
Co-authored-by: Margo Crawford <margaretc@vmware.com>
Makes it easy to deploy Pinniped under a different API group for manual
testing and iterating on integration tests on your laptop.
Signed-off-by: Ryan Richard <richardry@vmware.com>
Because it is a test of the conciergeclient package, and the naming
convention for integration test files is supervisor_*_test.go,
concierge_*_test.go, or cli_*_test.go to identify which component
the test is primarily covering.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This makes sure that if our clients ever send types with the wrong
group, the server will refuse to decode it.
Signed-off-by: Monis Khan <mok@vmware.com>
- I realized that the hardcoded fakekubeapi 404 not found response was invalid,
so we were getting a default error message. I fixed it so the tests follow a
higher fidelity code path.
- I caved and added a test for making sure the request body was always closed,
and believe it or not, we were double closing a body. I don't *think* this will
matter in production, since client-go will pass us ioutil.NopCloser()'s, but
at least we know now.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Yes, this is a huge commit.
The middleware allows you to customize the API groups of all of the
*.pinniped.dev API groups.
Some notes about other small things in this commit:
- We removed the internal/client package in favor of pkg/conciergeclient. The
two packages do basically the same thing. I don't think we use the former
anymore.
- We re-enabled cluster-scoped owner assertions in the integration tests.
This code was added in internal/ownerref. See a0546942 for when this
assertion was removed.
- Note: the middleware code is in charge of restoring the GV of a request object,
so we should never need to write mutations that do that.
- We updated the supervisor secret generation to no longer manually set an owner
reference to the deployment since the middleware code now does this. I think we
still need some way to make an initial event for the secret generator
controller, which involves knowing the namespace and the name of the generated
secret, so I still wired the deployment through. We could use a namespace/name
tuple here, but I was lazy.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
This was generated via `hugo gen chromastyles --style=monokailight > ./site/themes/pinniped/assets/scss/_syntax.css`.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We have these redirects set up to make the `kubectl apply -f [...]` commands cleaner, but we never went back and fixed up the documentation to use them until now.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This project overwrote the v1.0.0 tag with a different commit ID, which has caused issues with the Go module sum DB (which accurately detected the issue).
This has been one of the reasons why Dependabot is not updating our Go dependencies.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This optimizes our image in a few different ways:
- It adds a bunch of files and directories to the `.dockerignore` file.
This lets us have a single `COPY . .` but still be very aggressive about pruning what files end up in the build context.
- It adds build-time cache mounts to the `go build` commands using BuildKit's `--mount=type=cache` flag.
This requires BuildKit-capable Docker, but means that our Go builds can all be incremental builds.
This replaces the previous flow we had where we needed to split out `go mod download`.
- Instead of letting the full `apt-get install ca-certificates` layer end up in our final image, we copy just the single file we need.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We need this in CI when we want to configure Dex with the redirect URI for both
primary and secondary deploys at one time (since we only stand up Dex once).
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I didn't advertise this feature in the deploy README's since (hopefully) not
many people will want to use it?
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Previously, when triggering a Tilt reload via a *.go file change, a reload would
take ~13 seconds and we would see this error message in the Tilt logs for each
component.
Live Update failed with unexpected error:
command terminated with exit code 2
Falling back to a full image build + deploy
Now, Tilt should reload images a lot faster (~3 seconds) since we are running
the images as root.
Note! Reloading the Concierge component still takes ~13 seconds because there
are 2 containers running in the Concierge namespace that use the Concierge
image: the main Concierge app and the kube cert agent pod. Tilt can't live
reload both of these at once, so the reload takes longer and we see this error
message.
Will not perform Live Update because:
Error retrieving container info: can only get container info for a single pod; image target image:image/concierge has 2 pods
Falling back to a full image build + deploy
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
The group claims read from the session cache file are loaded as `[]interface{}` (slice of empty interfaces) so when we previously did a `groups, _ := idTokenClaims[oidc.DownstreamGroupsClaim].([]string)`, then `groups` would always end up nil.
The solution I tried here was to convert the expected value to also be `[]interface{}` so that `require.Equal(t, ...)` does the right thing.
This bug only showed up in our acceptance environment against Okta, since we don't have any other integration test coverage with IDPs that pass a groups claim.
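A sketch of the conversion, with wantGroups standing in for the expected groups from the test environment:
```
// JSON-decoded claims come back as []interface{}, so convert the expected
// []string to the same shape before comparing.
expectedGroups := make([]interface{}, 0, len(wantGroups))
for _, g := range wantGroups {
	expectedGroups = append(expectedGroups, g)
}
require.Equal(t, expectedGroups, idTokenClaims[oidc.DownstreamGroupsClaim])
```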
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We were seeing a race in this test code since the require.NoError() and
require.Eventually() would write to the same testing.T state on separate
goroutines. Hopefully this helper function should cover the cases when we want
to require.NoError() inside a require.Eventually() without causing a race.
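The helper can be roughly this shape; polling via k8s.io/apimachinery's wait package keeps everything on the calling goroutine, which is what avoids the race:
```
// RequireEventuallyWithoutError polls f on the current goroutine, so every
// write to t happens from a single goroutine.
func RequireEventuallyWithoutError(t *testing.T, f func() (bool, error), waitFor, tick time.Duration) {
	t.Helper()
	require.NoError(t, wait.PollImmediate(tick, waitFor, f))
}
```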
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Co-authored-by: Margo Crawford <margaretc@vmware.com>
Co-authored-by: Monis Khan <i@monis.app>
This reverts commit 4a28d1f800.
This commit was originally made to fix a bug that caused TokenCredentialRequest
to become slow when the server was idle for an extended period of time. This was
to address a Kubernetes issue that was fixed in 1.19.5 and onward. We are now
running with Kubernetes 1.20, so we should be able to pick up this fix.
This change updates our clients to always set an owner ref (sketched below) when:
1. The operation is a create
2. The object does not already have an owner ref set
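A sketch of that rule (the shape of the middleware hook is assumed):
```
func maybeSetOwnerRef(verb string, obj metav1.Object, ref metav1.OwnerReference) {
	if verb != "create" || len(obj.GetOwnerReferences()) > 0 {
		return // only creates, and never overwrite an existing owner ref
	}
	obj.SetOwnerReferences([]metav1.OwnerReference{ref})
}
```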
Signed-off-by: Monis Khan <mok@vmware.com>
See comment. This is at least a first step to make our GKE acceptance
environment greener. Previously, this test assumed that the Pinniped-under-test
had been deployed in (roughly) the last 10 minutes, which is not an assumption
that we make anywhere else in the integration test suite.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- JWKSWriterController
- JWKSObserverController
- FederationDomainSecretsController for HMAC keys
- FederationDomainSecretsController for state signature key
- FederationDomainSecretsController for state encryption key
Signed-off-by: Ryan Richard <richardry@vmware.com>
- Only sync on add/update of secrets in the same namespace which
have the "storage.pinniped.dev/garbage-collect-after" annotation, and
also during a full resync of the informer whenever secrets in the
same namespace with that annotation exist (see the sketch after this list).
- Ignore deleted secrets to avoid having this controller trigger itself
unnecessarily when it deletes a secret. This controller is never
interested in deleted secrets, since its only job is to delete
existing secrets.
- No change to the self-imposed rate limit logic. That still applies
because secrets with this annotation will be created and updated
regularly while the system is running (not just during rare system
configuration steps).
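A sketch of the filter rule from the list above (the helper shape is hypothetical):
```
const gcAnnotation = "storage.pinniped.dev/garbage-collect-after"

// shouldSync ignores deletes entirely and otherwise syncs only for secrets
// carrying the garbage-collection annotation.
func shouldSync(secret *corev1.Secret, isDelete bool) bool {
	if isDelete {
		return false // deletes are this controller's own doing
	}
	_, ok := secret.Annotations[gcAnnotation]
	return ok
}
```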
I'm not sure if these docs are used anywhere in our website, but I don't think
that they are. I'm assuming someone or something will yell if these should not
be deleted. These docs also live at the root of the repo, and the duplicate
versions are already drifting out of sync from one another.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
We stared at this very carefully and we don't think there are any structural changes. Maybe something small happened to get the RNG off by one?
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This implementation is janky because I wanted to make the smallest change
possible to try to get the code back to stable so we can release.
Also deep copy an object so we aren't mutating the cache.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This is a bit more clear. We're changing this now because it is a non-backwards-compatible change that we can make now since none of this RFC8693 token exchange stuff has been released yet.
There is also a small typo fix in some flag usages (s/RF8693/RFC8693/)
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- The overall timeout for logins is increased to 90 minutes.
- The timeout for token refresh is increased from 30 seconds to 60 seconds to be a bit more tolerant of extremely slow networks.
- A new, matching timeout of 60 seconds has been added for the OIDC discovery, auth code exchange, and RFC8693 token exchange operations.
The new code uses the `http.Client.Timeout` field rather than managing contexts on individual requests. This is easier because the OIDC package stores a context at creation time and tries to use it later when performing key refresh operations.
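A sketch of the pattern, using the github.com/coreos/go-oidc convention of carrying the HTTP client in the context:
```
// One client-wide 60s timeout covers every request this client makes,
// including the key refresh requests the OIDC library issues later.
httpClient := &http.Client{Timeout: 60 * time.Second}
ctx := oidc.ClientContext(context.Background(), httpClient)
provider, err := oidc.NewProvider(ctx, issuerURL)
```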
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Fosite overrides the `Cache-Control` header we set, which is basically fine even though it's not exactly what we want.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
It would be great to do this for the supervisor's callback endpoint as well, but it's difficult to get at those since the request happens inside the spawned browser.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
From RFC2616 (https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2):
> It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair,
> without changing the semantics of the message, by appending each subsequent field-value to the first,
> each separated by a comma.
This was correct before, but this simplifies things a bit and shaves off a few bytes from the response.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
The bug itself has to do with when headers are streamed to the client. Once a wrapped handler has sent any bytes to the `http.ResponseWriter`, the value of the map returned from `w.Header()` no longer matters for the response. The fix is fairly trivial, which is to add those response headers before invoking the wrapped handler.
The existing unit test didn't catch this due to limitations in `httptest.NewRecorder()`. It is now replaced with a new test that runs a full HTTP test server, which catches the previous bug.
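The shape of the fix, as a sketch:
```
// Set response headers before delegating; once the wrapped handler writes
// any bytes, later changes to w.Header() are silently ignored.
func withHeaders(wrapped http.Handler, headers map[string]string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for k, v := range headers {
			w.Header().Set(k, v)
		}
		wrapped.ServeHTTP(w, r)
	})
}
```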
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Because the library that we are using, which returns that error,
formats the timestamp in local time, which is LMT when running
on a laptop but UTC when running in CI.
Signed-off-by: Ryan Richard <richardry@vmware.com>
This reverts commit be4e34d0c0.
Roll back this change that was supposed to make the test more robust. If we
retry multiple token exchanges with the same auth code, of course we are going
to get failures on the second try onwards because the auth code was invalidated
on the first try.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
The value is correctly validated as `secrets.pinniped.dev/oidc-client` elsewhere, only this comment was wrong.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Refactor the test to avoid testing a private method and instead
always test the results of running the controller.
- Also remove the `if testing.Short()` check because it will always
be short when running unit tests. This prevented the unit test
from ever running, both locally and in CI.
Signed-off-by: Ryan Richard <richardry@vmware.com>
It now tests both the deprecated `pinniped get-kubeconfig` and the new `pinniped get kubeconfig --static-token` flows.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This is extracted from library.NewClientsetForKubeConfig(). It is useful so you can assert properties of the loaded, parsed kubeconfig.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Adds two new subcommands: `pinniped get kubeconfig` and `pinniped login static`
- Adds concierge support to `pinniped login oidc`.
- Adds back wrapper commands for the now deprecated `pinniped get-kubeconfig` and `pinniped exchange-credential` commands. These now wrap `pinniped get kubeconfig` and `pinniped login static` respectively.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This is a slightly evolved version of our previous client package, exported to be public and refactored to use functional options for API maintainability.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
I hope this will make TestSupervisorLogin less flaky. There are some instances
where the front half of the OIDC login flow happens so fast that the JWKS
controller doesn't have time to properly generate an asymmetric key.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I saw this message in our CI logs, which led me to this fix.
could not update status: OIDCProvider.config.supervisor.pinniped.dev "acceptance-provider" is invalid: status.status: Unsupported value: "SameIssuerHostMustUseSameSecret": supported values: "Success", "Duplicate", "Invalid"
Also - correct an integration test error message that was misleading.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
We believe this API is more forwards compatible with future secrets management
use cases. The implementation is a cry for help, but I was trying to follow the
previously established pattern of encapsulating the secret generation
functionality to a single group of packages.
This commit makes a breaking change to the current OIDCProvider API, but that
OIDCProvider API was added after the latest release, so it is technically still
in development until we release, and therefore we can continue to thrash on it.
I also took this opportunity to make some things private that didn't need to be
public.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This forced us to add labels to the CSRF cookie secret, just as we do
for other Supervisor secrets. Yay tests.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- AudienceMatchingStrategy: we want to use the default matcher from
fosite, so remove that line
- AllowedPromptValues: We can use the default if we add a small
change to the auth_handler.go to account for it (in a future commit)
- MinParameterEntropy: Use the fosite default to make it more likely
that off the shelf OIDC clients can work with the supervisor
Signed-off-by: Ryan Richard <richardry@vmware.com>
- Also add more log statements to the controller
- Also have the controller apply a rate limit to itself, to avoid
having a very chatty controller that runs way more often than is
needed.
- Also add an integration test for the controller's behavior.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
When we try to decode with the wrong decryption key, we could get any number of
error messages, depending on what failure mode we are in (couldn't authenticate
plaintext after decryption, couldn't deserialize, etc.). This change makes the
test weaker, but at least we know we will get an error message in the case where
the decryption key is wrong.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This also sets the CSRF cookie Secret's OwnerReference to the Pod's grandparent
Deployment so that when the Deployment is cleaned up, then the Secret is as
well.
Obviously this controller implementation has a lot of issues, but it will at
least get us started.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
There is still a test failing, but I am sure it is a simple fix hiding in the
code. I think this is the general shape of the controller that we want.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Note that we don't cache the securecookie.SecureCookie that we use in our
implementation. This was purely because of laziness. We should think about
caching this value in the future.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Make it more likely that the end user will get the more specific error
message saying that their refresh token has expired the first time
that they try to use an expired refresh token
Signed-off-by: Ryan Richard <richardry@vmware.com>
- This struct represents the configuration of all timeouts. These
timeouts are all interrelated, so declare them all in one place.
This should also make it easier to allow the user to override
our defaults if we would like to implement such a feature in the
future.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
Before this, we weren't properly parsing the `Content-Type` header. This broke integration with the Supervisor, since it sends an extra encoding parameter like `application/json;charset=UTF-8`.
This change switches to properly parsing with the `mime.ParseMediaType` function, and adds test cases to match the supervisor behavior.
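A sketch of the parsing change:
```
// mime.ParseMediaType splits "application/json;charset=UTF-8" into the media
// type and its parameters, so the comparison no longer trips on the charset.
mediaType, _, err := mime.ParseMediaType(resp.Header.Get("Content-Type"))
if err != nil {
	return fmt.Errorf("could not parse Content-Type: %w", err)
}
if mediaType != "application/json" {
	return fmt.Errorf("unexpected Content-Type %q", mediaType)
}
```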
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This default matches the static client we have defined in the supervisor, which will be the correct value in most cases.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
I think this should be more correct. In the server we're authenticating the request primarily via the `subject_token` parameter anyway, and Fosite needs the `client_id` to be set.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
kubectl 1.20 prints "Kubernetes control plane" instead of "Kubernetes master".
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- This is to make it easier for the token exchange branch to also edit
this test without causing a lot of merge conflicts with the
refresh token branch, to enable parallel development of closely
related stories.
- This refactor will allow us to add new test tables for the
refresh and token exchange requests, which both must come after
an initial successful authcode exchange has already happened
Signed-off-by: Margo Crawford <margaretc@vmware.com>
We decided that we don't really need these in every case, since we'll be returning username and groups in a custom claim.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This refactors the `UpstreamOIDCIdentityProviderI` interface and its implementations to pass ID token claims through a `*oidctypes.Token` return parameter rather than as a third return parameter.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This is just a more convenient copy of these values which are already stored inside the ID token. This will save us from having to pass them around separately or re-parse them later.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
I'm worried that these errors are going to be really buried from the user, so
add some log statements to try to make them a tiny bit more observable.
Also follow some of our error message conventions by using lowercase error
messages.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This CSRF cookie needs to be included on the request to the callback endpoint triggered by the redirect from the OIDC upstream provider. This is not allowed by `Same-Site=Strict` but is allowed by `Same-Site=Lax` because it is a "cross-site top-level navigation" [1].
We didn't catch this earlier with our Dex-based tests because the upstream and downstream issuers were on the same parent domain `*.svc.cluster.local` so the cookie was allowed even with `Strict` mode.
[1]: https://tools.ietf.org/html/draft-ietf-httpbis-cookie-same-site-00#section-3.2
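The cookie settings implied by the fix look roughly like this (cookie name and value variable illustrative):
```
http.SetCookie(w, &http.Cookie{
	Name:     "pinniped-csrf", // illustrative name
	Value:    signedCSRFValue,
	SameSite: http.SameSiteLaxMode, // was Strict; Lax permits the top-level redirect back to us
	HttpOnly: true,
	Secure:   true,
	Path:     "/",
})
```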
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This commit includes a failing test (amongst other compiler failures) for the
dynamic signing key fetcher that we will inject into fosite. We are checking it
in so that we can pass the WIP off.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
We are currently using EC keys to sign ID tokens, so we should reflect that in
our OIDC discovery metadata.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This fixes a regression introduced by 24c4bc0dd4. It could occasionally cause the tests to fail when run on a machine with an IPv6 localhost interface. As a fix I added a wrapper for the new Go 1.15 `LookupIP()` method, and created a partially-functional backport for Go 1.14. This should be easy to delete in the future.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This adds a few new "create test object" helpers and extends `CreateTestOIDCProvider()` to optionally wait for the created OIDCProvider to enter some expected status condition.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This allows the token exchange request to be performed with the correct TLS configuration.
We go to a bit of extra work to make sure the `http.Client` object is cached between reconcile operations so that connection pooling works as expected.
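A sketch of the caching idea (field and method names hypothetical): reuse the same *http.Client as long as the upstream's CA bundle is unchanged, so pooled TLS connections survive across reconciles.
```
type clientCache struct {
	caBundleHash [sha256.Size]byte
	client       *http.Client
}

func (c *clientCache) get(caBundlePEM []byte) *http.Client {
	hash := sha256.Sum256(caBundlePEM)
	if c.client == nil || hash != c.caBundleHash {
		pool := x509.NewCertPool()
		pool.AppendCertsFromPEM(caBundlePEM)
		c.caBundleHash = hash
		c.client = &http.Client{
			Timeout:   60 * time.Second,
			Transport: &http.Transport{TLSClientConfig: &tls.Config{RootCAs: pool}},
		}
	}
	return c.client
}
```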
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Note that this WIP commit includes a failing unit test, which will
be addressed in the next commit
Signed-off-by: Ryan Richard <richardry@vmware.com>
Mainly, avoid using some `testing` helpers that were added in 1.14, as well as a couple of other niceties we can live without.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We were assuming that env.SupervisorHTTPAddress was set, but it might not be
depending on the environment on which the integration tests are being run. For
example, in our acceptance environments, we don't currently set
env.SupervisorHTTPAddress.
I tried to follow the pattern from TestSupervisorOIDCDiscovery here.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Generate a new cookie for the user and move on as if they had not sent
a bad cookie. Hopefully this will make the user experience better if,
for example, the server rotated cookie signing keys and then a user
submitted a very old cookie.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Also use ConstantTimeCompare() to compare CSRF tokens to prevent
leaking any information in how quickly we reject bad tokens.
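A sketch of the comparison:
```
// subtle.ConstantTimeCompare's runtime depends only on the input length, so
// a mismatch reveals nothing about where the bytes first differ.
func csrfTokensEqual(a, b []byte) bool {
	return subtle.ConstantTimeCompare(a, b) == 1
}
```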
Signed-off-by: Ryan Richard <richardry@vmware.com>
This is much nicer UX for an administrator installing an UpstreamOIDCProvider
CRD. They don't have to guess as hard at what the callback endpoint path should
be for their UpstreamOIDCProvider.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Also aggressively refactor for readability:
- Make helper validation functions for each type of storage
- Try to label symbols based on their downstream/upstream use and group them
accordingly
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Prior to this we re-used the CLI testing client to test the authorize flow of the supervisor, but they really need to be separate upstream clients. For example, the supervisor client should be a non-public client with a client secret and a different callback endpoint.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Also handle several more error cases
- Move RequireTimeInDelta to shared testutils package so other tests
can also use it
- Move all of the oidc test helpers into a new oidc/oidctestutils
package to break a circular import dependency. The shared testutil
package can't depend on any of our other packages or else we
end up with circular dependencies.
- Lots more assertions about what was stored at the end of the
request to build confidence that we are going to pass all of the
right settings over to the token endpoint through the storage, and
also to avoid accidental regressions in that area in the future
Signed-off-by: Ryan Richard <richardry@vmware.com>
Also refactor to get rid of duplicate test structs.
Also also don't default groups ID token claim because there is no standard one.
Also also also add some logging that will hopefully help us in debugging in the
future.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Because we want it to implement an AuthcodeExchanger interface and
do it in a way that will be more unit test-friendly than the underlying
library that we intend to use inside its implementation.
This will allow it to be imported by Go code outside of our repository, which was something we have planned for since this code was written.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Before, we did this in an init container, which meant if the Dex pod restarted we would have fresh certs, but our Tilt/bash setup didn't account for this.
Now, the certs are generated by a Job which runs once and saves the generated files into a Secret. This should be a bit more stable.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This change deploys a small Squid-based proxy into the `dex` namespace in our integration test environment. This lets us use the cluster-local DNS name (`http://dex.dex.svc.cluster.local/dex`) as the OIDC issuer. It will make generating certificates easier, and most importantly it will mean that our CLI can see Dex at the same name/URL as the supervisor running inside the cluster.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- To better support having multiple downstream providers configured,
the authorize endpoint will share a CSRF cookie between all
downstream providers' authorize endpoints. The first time a
user's browser hits the authorize endpoint of any downstream
provider, that endpoint will set the cookie. Then if the user
starts an authorize flow with that same downstream provider or with
any other downstream provider which shares the same domain name
(i.e. differentiated by issuer path), then the same cookie will be
submitted and respected.
- Just in case we are sharing the domain name with some other app,
we sign the value of any new CSRF cookie and check the signature
when we receive the cookie. This wasn't strictly necessary since
we probably won't share a domain name with other apps, but it
wasn't hard to add this cookie signing.
Signed-off-by: Ryan Richard <richardry@vmware.com>
We want to have our APIs respond to `kubectl get pinniped`, and we shouldn't use `all` because we don't think most average users should have permission to see our API types, which means if we put our types there, they would get an error from `kubectl get all`.
I also added some tests to assert these properties on all `*.pinniped.dev` API resources.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This check predates the API renaming we did. Now that our API groups have `concierge`/`supervisor` in the name, we don't need to maintain a specific set of `cp` commands and keep them in sync, so we don't really need this check.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
These only really make sense for aggregated API types where we need `conversion-gen` to do version conversion.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Our unit tests are gonna touch a lot more corner cases than our
integration tests, so let's make them run as close to the real
implementation as possible.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This commit is meant to be reverted when we are unblocked and
ready to start working on this integration test again. Temporarily
remove it so we can merge this PR to main.
Note: I had tried using t.Skip() in the test, but then that caused lint
failures, so decided to just remove it for now.
We want to run all of the fosite validations in the authorize
endpoint, but we don't need to store anything yet because
we are storing what we need for later in the upstream state
parameter.
Signed-off-by: Ryan Richard <richardry@vmware.com>
This is helpful for us, amongst other users, because we want to enable "debug"
logging whenever we deploy components for testing.
See a5643e3 for addition of log level.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Add a new helper method to plog to make a consistent way to log
expected errors at the info level (as opposed to unexpected
system errors that would be logged using plog.Error)
Signed-off-by: Ryan Richard <richardry@vmware.com>
This is awaiting the new upstream OIDC provider CRD in order
to pass, but hopefully this is a starting point for us.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Also move definition of our oauth client and the general fosite
configuration to a helper so we can use the same config to construct
the handler for both test and production code.
Signed-off-by: Ryan Richard <richardry@vmware.com>
This prevents unnecessary sync loop runs when the controller is
running with a single worker. When the controller is running with
more than one worker, it prevents subtle bugs that can cause the
controller to go "back in time."
Signed-off-by: Monis Khan <mok@vmware.com>
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Previously we were injecting the whole oauth handler chain into this function,
which meant we were essentially writing unit tests to test our tests. Let's push
some of this logic into the source code.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Does not validate incoming request parameters yet. Also is not
served on the http/https ports yet. Those will come in future commits.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This is needed on clusters with PodSecurityPolicy enabled by default, but should be harmless in other cases.
This is generally needed because a restrictive PodSecurityPolicy will usually otherwise prevent the `hostPath` volume mount needed by the dynamically-created cert agent pod.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This is the beginning of a change to add cpu/memory limits to our pods.
We are doing this because some consumers require this, and it is generally
a good practice.
Setting limits == requests gives us "Guaranteed" QoS.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I tried to follow a principle of encapsulation here - we can still default to
peeps making connections to 80/443 on a Service object, but internally we will
use 8080/8443.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
And delete the agent pod when it needs its custom labels to be
updated, so that the creator controller will notice that it is missing
and immediately create it with the new custom labels.
This only matters for local development, since we don't use this script directly in CI. Makes the full codegen step take ~90s on my laptop.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This is the first of a few related changes that re-organize our API after the big recent changes that introduced the supervisor component.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Setting a Secret in the supervisor's namespace with a special name
will cause it to get picked up and served as the supervisor's TLS
cert for any request which does not have a matching SNI cert.
- This is especially useful for when there is no DNS record for an
issuer and the user will be accessing it via IP address. This
is not how we would expect it to be used in production, but it
might be useful for other cases.
- Includes a new integration test
- Also suppress all of the warnings about ignoring the error returned by
Close() in lines like `defer x.Close()` to make GoLand happier
- TLS certificates can be configured on the OIDCProviderConfig using
the `secretName` field.
- When listening for incoming TLS connections, choose the TLS cert
based on the SNI hostname of the incoming request.
- Because SNI hostname information on incoming requests does not include
the port number of the request, we add a validation that
OIDCProviderConfigs where the issuer hostnames (not including port
number) are the same must use the same `secretName`.
- Note that this approach does not yet support requests made to an
IP address instead of a hostname. Also note that `localhost` is
considered a hostname by SNI.
- Add port 443 as a container port to the pod spec.
- A new controller watches for TLS secrets and caches them in memory.
That same in-memory cache is used while servicing incoming connections
on the TLS port.
- Make it easy to configure both port 443 and/or port 80 for various
Service types using our ytt templates for the supervisor.
- When deploying to kind, add another nodeport and forward it to the
host on another port to expose our new HTTPS supervisor port to the
host.
- When two different Issuers have the same host (i.e. they differ
only by path) then they must have the same secretName. This is because
it wouldn't make sense for there to be two different TLS certificates
for one host. Find any that do not have the same secret name to
put an error status on them and to avoid serving OIDC endpoints for
them. The host comparison is case-insensitive.
- Issuer hostnames should be treated as case-insensitive, because
DNS hostnames are case-insensitive. So https://me.com and
https://mE.cOm are duplicate issuers. However, paths are
case-sensitive, so https://me.com/A and https://me.com/a are
different issuers. Fixed this in the issuer validations and in the
OIDC Manager's request router logic (see the sketch below).
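A sketch of the host normalization used for comparison and routing:
```
// Hosts compare case-insensitively and without the port; paths are handled
// separately because they are case-sensitive.
func issuerHostKey(issuer *url.URL) string {
	return strings.ToLower(issuer.Hostname()) // Hostname() drops any :port
}
```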
This was hidden behind a `pinniped alpha` hidden subcommand, but we're comfortable enough with the CLI flag interface now to promote it.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
When using kind we forward the node's port to the host, so we only
really care about the `nodePort` value. For acceptance clusters,
we put an Ingress in front of a NodePort Service, so we only really
care about the `port` value.
Based on our experiences today with GKE, it will be easier for our users
to configure Ingress health checks if the healthz endpoint is available
on the same public port as the OIDC endpoints.
Also add an integration test for the healthz endpoint now that it is
public.
Also add the optional `containers[].ports.containerPort` to the
supervisor Deployment because the GKE docs say that GKE will look
at that field while inferring how to invoke the health endpoint. See
https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#def_inf_hc
- Not used by any of our integration test clusters yet
- Planning to use it later for the kind clusters and maybe for
the acceptance clusters too (although the acceptance clusters might
not need to use self-signed certs so maybe not)
- It didn't matter before because it would be cleaned up by a
t.Cleanup() function, but now that we might loop twice, it matters
on the second pass through the loop
EC keys are smaller and take less time to generate. Our integration
tests were super flaky because generating an RSA key would take up to
10 seconds *gasp*. The main token verifier that we care about is
Kubernetes, which supports P256, so hopefully it won't be that much of
an issue that our default signing key type is EC. The OIDC spec seems
kinda squirmy when it comes to using non-RSA signing algorithms...
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
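To make the key-generation cost difference concrete, here is a small
self-contained Go comparison using only the standard library (exact
timings will vary by machine):
```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/rsa"
	"fmt"
	"time"
)

func main() {
	start := time.Now()
	// Generating a P-256 key is effectively instant...
	if _, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader); err != nil {
		panic(err)
	}
	fmt.Println("ecdsa P-256:", time.Since(start))

	start = time.Now()
	// ...while 2048-bit RSA key generation must search for large random
	// primes, which can take seconds on a slow or heavily loaded machine.
	if _, err := rsa.GenerateKey(rand.Reader, 2048); err != nil {
		panic(err)
	}
	fmt.Println("rsa 2048:", time.Since(start))
}
```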
I brought this over because I copied code from work in flight on
another branch. But now the other branch doesn't use this package.
No use bringing on another dependency if we can avoid it.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- When using `local()` in the Tiltfile, Tilt will not know
to watch those files for changes, so each time we use
`local()` we now also use `watch_file()`
- As a result, editing a ytt template file now causes
an immediate `kubectl apply` of the results
- New optional ytt value called `into_namespace` means install into that
preexisting namespace rather than creating a new namespace for each app
- Also ensure that every resource that is created statically by our yaml
at install-time by either app is labeled consistently
- Also support adding custom labels to all of those resources from a
new ytt value called `custom_labels`
This will be used for other types of "capabilities" of the test environment besides just those of the test cluster, such as those of an upstream OIDC provider.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Based on the spec, it seems OAuth2 servers that do not support PKCE are required to simply ignore the parameters, so this should always work.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Add install-pinniped-supervisor.yaml and rename install-pinniped.yaml
to install-pinniped-concierge.yaml in the release process and
installation/demo documentation.
- Tiltfile and prepare-for-integration-tests.sh both specify the
NodePort Service using `--data-value-yaml 'service_nodeport_port=31234'`
- Also rename the namespaces used by the Concierge and Supervisor apps
during integration tests running locally
- Also continue renaming things related to the concierge app
- Enhance the uninstall test to also test uninstalling the supervisor
and local-user-authenticator apps
- Variables specific to the Concierge now include "concierge" in their names
- All variables now start with `PINNIPED_TEST_` which makes it clear
that they are for tests and also helps them not conflict with the
env vars that are used in the Pinniped CLI code
- The OIDCProviderConfigWatcherController synchronizes the
OIDCProviderConfig settings to dynamically mount and unmount the
OIDC discovery endpoints for each provider (sketched after this list)
- Integration test passes but unit tests need to be added still
- Intended to be a red test in this commit; will make it go
green in a future commit
- Enhance env.go and prepare-for-integration-tests.sh to make it
possible to write integration tests for the supervisor app
by setting more env vars and by exposing the service to the kind
host on a localhost port
- Add `--clean` option to prepare-for-integration-tests.sh
to make it easier to start fresh
- Make prepare-for-integration-tests.sh advise you to run
`go test -v -count 1 ./test/integration` because this does
not buffer the test output
- Make concierge_api_discovery_test.go pass by adding expectations
for the new OIDCProviderConfig type
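As a rough sketch of what "dynamically mount and unmount" can look
like in Go (the type and method names here are hypothetical, not the
actual implementation):
```go
package manager

import (
	"net/http"
	"strings"
	"sync"
)

// Manager is a hypothetical http.Handler whose set of OIDC providers
// can be swapped at runtime by the watcher controller.
type Manager struct {
	mu        sync.RWMutex
	providers map[string]http.Handler // keyed by issuer path, e.g. "/some/issuer"
}

// SetProviders replaces the full set of providers; the controller calls
// this whenever the OIDCProviderConfig settings change.
func (m *Manager) SetProviders(providers map[string]http.Handler) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.providers = providers
}

func (m *Manager) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	for path, h := range m.providers {
		if r.URL.Path == path || strings.HasPrefix(r.URL.Path, path+"/") {
			h.ServeHTTP(w, r)
			return
		}
	}
	http.NotFound(w, r)
}
```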
- Note that this avoids committing the demo screencast
file to our git history because it is 5.76 MB. We don't
want to have to download that content on
every `git clone`.
- Instead the file is hosted by GitHub's CDN
This will hopefully come in handy later if we ever decide to add
support for multiple OIDC providers as a part of one supervisor.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Moves the "front matter" to a Markdown comment so you don't necessarily have to delete it.
- Reduces a little bit of boilerplate (this is a bit subjective).
- Tweaks some formatting (also subjective).
- Describe what happens when you use "Fixes [...]".
- Tweak the release note block so it should be easier to parse out automatically (using the same syntax as Kubernetes).
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This needs to be overridden for Tilt usage, since the main image referenced by Tilt isn't actually pullable.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Prevent the server binary from lying about its version number
by having it report "?.?.?" as its version number for now.
- Later we can devise a way for CI to inject the version number
for the server into the container image at release time,
not at build time, since the version number is not known
at build time.
- Pre-release builds of the binary from before the release stage, or
builds on developer workstations, will also report "?.?.?" as their
version number, which is fine since they are not official releases
and shouldn't find their way to the public.
I saw this helper function the other day and wondered if we could use it.
It does indeed look like it does what we want, because when I run this code,
I get `...User "system:anonymous" cannot get resource...`.
c := library.NewAnonymousPinnipedClientset(t)
_, err := c.
ConfigV1alpha1().
CredentialIssuerConfigs("integration").
Get(context.Background(), "pinniped-config", metav1.GetOptions{})
t.Log(err)
I also ran a similar test using this new helper in the context of
library.NewClientsetWithCertAndKey(). Seemed to get us what we want.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This change replaces our previous test helpers for checking cluster capabilities and passing external test parameters. Prior to this change, we always used `$PINNIPED_*` environment variables and these variables were accessed throughout the test code.
The new code introduces a more strongly-typed `TestEnv` structure and helpers which load and expose the parameters. Tests can now call `env := library.IntegrationEnv(t)`, then access parameters such as `env.Namespace` or `env.TestUser.Token`. This should make this data dependency easier to manage and refactor in the future. In many ways this is just an extended version of the previous cluster capabilities YAML.
Tests can also check for cluster capabilities easily by using `env := library.IntegrationEnv(t).WithCapability(xyz)`.
The actual parameters are still loaded from OS environment variables by default (for compatibility), but the code now also tries to load the data from a Kubernetes Secret (`integration/pinniped-test-env` by default). I'm hoping this will be a more convenient way to pass data between various scripts than the local `/tmp` directory. I hope to remove the OS environment code in a future commit.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
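A sketch of what a consuming test looks like under this scheme (the
import path and capability value are illustrative, not exact):
```go
package integration

import (
	"testing"

	"github.com/vmware-tanzu/pinniped/test/library" // illustrative import path
)

func TestExample(t *testing.T) {
	// Loads parameters from env vars (or the pinniped-test-env Secret)
	// and skips the test unless the cluster has the named capability.
	// The capability value below is a made-up example.
	env := library.IntegrationEnv(t).WithCapability("clusterSigningKeyIsAvailable")

	t.Logf("running in namespace %q", env.Namespace)
	t.Logf("have a test user token: %v", env.TestUser.Token != "")
}
```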
This should fix integration tests running on clusters that don't have
visible controller manager pods (e.g., GKE). Pinniped should boot, not
find any controller manager pods, but still post a status in the CIC.
I also updated a test helper so that we could tell the difference
between when an event was not added and when an event was added with
an empty key.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Right now in the YTT templates we assume that the agent pods are gonna use
the same image as the main Pinniped deployment, so we can use the same logic
for the image pull secrets.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This seems to fail on CI when the Concourse workers get slow and
kind stops working reliably. It would be interesting to see the
error message in that case to figure out if there's anything we
could do to make the test more resilient.
Simplifies the implementation, makes it more consistent with other
updaters of the cic (CredentialIssuerConfig), and also retries on
update conflicts
Signed-off-by: Ryan Richard <richardry@vmware.com>
- Only inject things through the constructor that the controller
will need
- Use pkg private constants when possible for things that are not
actually configurable by the user
- Make the agent pod template private to the pkg
- Introduce a test helper to reduce some duplicated test code
- Remove some `it.Focus` lines that were accidentally committed, and
repair the broken tests that they were hiding
I think we want to reconcile on these pod template fields so that if
someone were to redeploy Pinniped with a new image for the agent, the
agent would get updated immediately. Before this change, the agent image
wouldn't get updated until the agent pod was deleted.
This thing is supposed to be used to help our CredentialRequest handler issue certs with a dynamic
CA keypair.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
3 main reasons:
- The cert and key that we store in this object are not always used for TLS.
- The package name "provider" was a little too generic.
- dynamiccert.Provider reads more go-ish than provider.DynamicCertProvider.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
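The resulting shape is roughly the following (sketched with assumed
method names, not the real ones):
```go
package dynamiccert

import "sync"

// Provider is a sketch of a thread-safe holder for a cert/key pair
// that can be rotated at runtime by controllers.
type Provider interface {
	Set(certPEM, keyPEM []byte)
	CurrentCertKeyContent() (certPEM, keyPEM []byte)
}

type provider struct {
	mu   sync.RWMutex
	cert []byte
	key  []byte
}

func New() Provider { return &provider{} }

func (p *provider) Set(certPEM, keyPEM []byte) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.cert, p.key = certPEM, keyPEM
}

func (p *provider) CurrentCertKeyContent() ([]byte, []byte) {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.cert, p.key
}
```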
- Lots of TODOs added that need to be resolved to finish this WIP
- execer_test.go seems like it should be passing, but it fails (sigh)
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Annotations do not have this restriction, so we can put it there instead. This only currently occurs on clusters without the cluster signing capability (GKE).
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This also has fallback compatibility support if no IDP is specified and there is exactly one IDP in the cache.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- All of the `kubecertagent` controllers now take two informers
- This is moving in the direction of creating the agent pods in the
Pinniped installation namespace, but that will come in a future
commit
New resource naming conventions:
- Do not repeat the Kind in the name,
e.g. do not call it foo-cluster-role-binding, just call it foo
- Names will generally start with a prefix to identify our component,
so when a user lists all objects of that kind, they can tell to which
component it is related,
e.g. `kubectl get configmaps` would list one named "pinniped-config"
- It should be possible for an operator to make the word "pinniped"
mostly disappear if they choose, by specifying the app_name in
values.yaml, to the extent that is practical (but not from APIService
names because those are hardcoded in golang)
- Each role/clusterrole and its corresponding binding have the same name
- Pinniped resource names that must be known by the server golang code
are passed to the code at run time via ConfigMap, rather than
hardcoded in the golang code. This also allows them to be prepended
with the app_name from values.yaml while creating the ConfigMap.
- Since the CLI `get-kubeconfig` command cannot guess the name of the
CredentialIssuerConfig resource in advance anymore, it lists all
CredentialIssuerConfigs in the app's namespace and returns an error
if there is not exactly one found, and then uses that one regardless
of its name
Eventually we could refactor to remove support for the old APIs, but they are so similar that a single implementation seems to handle both easily.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This is essentially meant to be "v1alpha2" of the existing CredentialRequest API, but since we want to move API groups we can just start over at v1alpha1.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We want to be able to run kind integration tests against the same
versions that we generate code against. There is no public
kindest/node image for 1.17.9, so let's update to the next 1.17.x
version where there is an image: 1.17.11.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This was previously using the unpadded (raw) base64 encoder, which worked sometimes (if the CA happened to be a length that didn't require padding). The correct encoding is the `base64.StdEncoding` one that includes padding.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
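The difference is easy to demonstrate with the standard library; any
input whose length is not a multiple of three needs padding:
```go
package main

import (
	"encoding/base64"
	"fmt"
)

func main() {
	ca := []byte("12345") // 5 bytes, not a multiple of 3, so padding is required

	// The raw encoding omits the trailing '=' padding; consumers that
	// expect standard base64 will reject it for some input lengths.
	fmt.Println(base64.RawStdEncoding.EncodeToString(ca)) // MTIzNDU
	// StdEncoding includes padding, which is what kubeconfig fields expect.
	fmt.Println(base64.StdEncoding.EncodeToString(ca)) // MTIzNDU=
}
```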
The dry-run fails now because we are trying to install a CRD and a custom resource (of that CRD type) in the same step.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
I tried to follow the following principles.
- Use existing wordsmithing.
- Only document things that we support today.
- Grab our README.md reader's attention using a picture.
- Use "upstream" when referring to OSS and "external" when referring to
IDP integrations.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Add flag parsing and help messages for root command,
`exchange-credential` subcommand, and new `get-kubeconfig` subcommand
- The new `get-kubeconfig` subcommand is a work in progress in this
commit
- Also add here.Doc() and here.Docf() to enable nice heredocs in
our code (usage sketched below)
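For context, a heredoc helper like this is typically used as follows;
the exact signature and import path of here.Docf are assumptions here,
modeled on common Go heredoc helpers:
```go
package cmd

import (
	"fmt"

	"github.com/vmware-tanzu/pinniped/internal/here" // assumed import path
)

func printHelp() {
	// here.Docf is assumed to strip the common leading indentation and
	// apply fmt-style formatting, so that long help text can be indented
	// naturally inside Go source.
	fmt.Print(here.Docf(`
		Print a kubeconfig for authenticating into %s.

		Requires admin-level access to the cluster.
	`, "Kubernetes"))
}
```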
So I looked into other TokenReview webhook implementations, and most
of them just use the json stdlib package to unmarshal/marshal
TokenReview payloads. I'd say let's follow that pattern, even though
it leads to extra fields in the JSON payload (these are not harmful).
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
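For example, round-tripping the TokenReview type through the json
stdlib looks like this; the zero-valued extra fields (such as "spec"
and "metadata") show up in the output, which is the harmless part
mentioned above:
```go
package main

import (
	"encoding/json"
	"fmt"

	authenticationv1 "k8s.io/api/authentication/v1"
)

func main() {
	review := authenticationv1.TokenReview{
		Status: authenticationv1.TokenReviewStatus{
			Authenticated: true,
			User:          authenticationv1.UserInfo{Username: "test-user"},
		},
	}
	// json.Marshal happily round-trips the API type; it just includes
	// some extra zero-valued fields compared to the k8s codecs.
	out, err := json.Marshal(review)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```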
- Also correct the webhook url setting in prepare-for-integration-tests.sh
- Change the bcrypt count to 10, because 16 is way too slow on old laptops
Signed-off-by: Ryan Richard <richardry@vmware.com>
I also started updating the script to deploy the test-webhook instead of
doing TMC stuff. I think the script should live in this repo so that
Pinniped contributors only need to worry about one repo for running
integration tests.
There are a bunch of TODOs in the script, but I figured this was a good
checkpoint. The script successfully runs on my machine and sets up the
test-webhook and pinniped on a local kind cluster. The integration tests
are failing because of some issue with pinniped talking to the test-webhook,
but this is a step in the right direction.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This script was basically an alias for `./hack/module.sh unittest`. We even
tell people to run the unit tests via module.sh in our contributing doc.
Let's ditch it - the best line of (shell) code is the one you don't write.
An analogous change was made in CI to use module.sh in place of test-unit.sh.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- For now, build the test-webhook binary in the same container image as
the pinniped-server binary, to make it easier to distribute
- Also fix lots of bugs from the first draft of the test-webhook's
`/authenticate` implementation from the previous commit
- Add a detailed README for the new deploy-test-webhook directory
The webhook still needs to be updated to auto generate its
certificates.
We decided not to give this webhook its own go module for now since
this webhook only pulled in one more dependency, and it is a
dependency that we will most likely need in the future.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- The certs manager controller, along with its sibling certs expirer
and certs observer controllers, are generally useful for any process
that wants to create its own CA and TLS certs, but only if the
updating of the APIService is not included in those controllers
- So that functionality for updating APIServices is moved to a new
controller which watches the same Secret which is used by those
other controllers
- Also parameterize `NewCertsManagerController` with the service name
and the CA common name to make the controller more reusable
- We are not setting an upper limit because Kubernetes might randomly
decide to unschedule our pod in ways that we can't anticipate in
advance, causing very hard to reproduce production bugs.
- We noticed that our app currently uses ~30 MB of memory when idle,
and ~35 MB of memory under some load. So a memory request of 128
MB should be reasonable.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
When we use RSA private keys to sign our test certificates, we run
into strange test timeouts. The internal/controller/apicerts package
was timing out on my machine more than once every 3 runs. When I
changed the RSA crypto to EC crypto, this timeout goes away. I'm not
gonna try to figure out what the deal is here because I think it would
take longer than it would be worth (although I am sure it is some fun
story involving prime numbers; the goroutine traces for timed out
tests would always include some big.Int operations involving prime
numbers...).
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
It looks like requests to our aggregated API service on GKE vacillate
between success and failure until they reach a converged successful
state. I think this has to do with our pods updating the API serving
cert at different times. If only one pod updates its serving cert to
the correct value, then it should respond with success. However, the
other pod would respond with failure. Depending on the load balancing
algorithm that GKE uses to send traffic to pods in a service, we could
end up with a success that we interpret as "all pods have rotated
their certs" when it really just means "at least one pod has rotated
its certs."
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Controllers will automatically run again when there's an error,
but when we want to update CredentialIssuerConfig from server.go
we should be careful to retry on conflicts
- Add unit tests for `issuerconfig.CreateOrUpdateCredentialIssuerConfig()`
which was covered by integration tests in previous commits, but not
covered by unit tests yet (the retry pattern is sketched below).
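The retry-on-conflict part follows the standard client-go pattern.
Here is a minimal sketch, where the update closure is whatever fresh
get-mutate-update logic the caller supplies:
```go
package issuerconfig

import "k8s.io/client-go/util/retry"

// syncWithRetries re-runs updateFn whenever the API server returns a
// 409 Conflict (i.e., someone else updated the object first), with the
// default backoff between attempts. updateFn should get a fresh copy
// of the object, mutate it, and call Update.
func syncWithRetries(updateFn func() error) error {
	return retry.RetryOnConflict(retry.DefaultRetry, updateFn)
}
```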
So that operators won't look at the lifetime of the CA cert and be
like, "wtf, why does the serving cert have the lifetime that I
specified, but its CA cert is valid for 100 years".
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Upgrade from `1.19.0-rc.0` to the newly-released `1.19.0`.
- Downgrade from `1.18.6` to `1.18.2` to match some downstream consumers.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This should simplify our build/test setup quite a bit, since it means we have only a single module (at the top level) with all hand-written code. I'll leave `module.sh` alone for now but we may be able to simplify that a bit more.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We were using this at one point to control which tests ran with `go test ./...`, but now we're also using the `-short` flag to differentiate unit vs. integration tests.
Hopefully this will simplify things a bit.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
kubectl pulls these in in its main package... I wonder if we should do
the same for our main packages?
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Controller and aggregated API server are allowed to run
- Keep retrying to borrow the cluster signing key in case the failure
to get it was caused by a transient failure
- The CredentialRequest endpoint will always return an authentication
failure as long as the cluster signing key cannot be borrowed
- Update which integration tests are skipped to reflect what should
and should not work based on the cluster's capability under this
new behavior
- Move CreateOrUpdateCredentialIssuerConfig() and related methods
to their own file
- Update the CredentialIssuerConfig's Status every time we try to
refresh the cluster signing key
- Indicate the success or failure of the cluster signing key strategy
- Also introduce the concept of "capabilities" of an integration test
cluster to allow the integration tests to be run against clusters
that do or don't allow the borrowing of the cluster signing key
- Tests that are not expected to pass on clusters that lack the
borrowing of the signing key capability are now ignored by
calling the new library.SkipUnlessClusterHasCapability test helper
- Rename library.Getenv to library.GetEnv
- Add copyrights where they were missing
These never worked quite right, so let's disable them for now: #51
We can probably come up with some better solution now with the new codegen scripts, but I'll leave that for later.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We are seeing between 1 and 2 minutes of difference between the current time
reported in the API server pod and the pinniped pods on one of our testing
environments. Hopefully this change makes our tests pass again.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Didn't fix CI. I didn't think it would.
I have never seen the integration tests fail like this locally, so I
have to imagine the failure has something to do with the environment
on which we are testing.
This reverts commit ba2e2f509a.
We are getting these weird flakes in CI where the kube client that we
create with these helper functions doesn't work against the kube API.
The kube API tells us that we are unauthorized (401). Seems like something
is wrong with the keypair itself, but when I create a one-off kubeconfig
with the keypair, I get 200s from the API. Hmmm...I wonder what CI will
think of this change?
I also tried to align some naming in this package.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Makes it easier to guess/remember what are the legal arguments
- Also update the output a little to make it easier to tell
when the command has succeeded
- And run tests using `-count 1` because cached test results are not
very trustworthy
These configuration knobs are much more human-understandable than the
previous percentage-based threshold flag.
We now allow users to set the lifetime of the serving cert via a ConfigMap.
Previously this was hardcoded to 1 year.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
We need a way to validate that this generated code is up to date. I added
a long-term engineering TODO for this.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
The rotation is forced by a new controller that deletes the serving cert
secret, as other controllers will see this deletion and ensure that a new
serving cert is created.
Note that the integration tests now have an additional worst-case runtime of
60 seconds. This is because of the way that the aggregated API server code
reloads certificates. We will fix this in a future story. Then, the
integration tests should hopefully get much faster.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This switches us back to an approach where we use the Pod "exec" API to grab the keys we need, rather than forcing our code to run on the control plane node. It will help us fail gracefully (or dynamically switch to alternate implementations) when the cluster is not self-hosted.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
We don't want people to run codegen.sh directly, because it is meant
to be driven by hack/module.sh. To discourage this behavior, we will hide
codegen.sh away in hack/lib. I don't think this is actually what the
hack/lib directory is for, though...meh.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Runs code generation on a per-module basis. If `CONTAINED` is not set,
the code generation is run in a container.
The mount point in Docker is randomized to simulate Concourse.
Introduce K8S_PKG_VERSION to make room to build different versions
eventually.
- Call the auto-generated /healthz endpoint of our aggregated API server
- Use http for liveness even though tcp seems like it might be
more appropriate, because tcp probes cause TLS handshake errors
to appear in our logs every few seconds
- Use conservative timeouts and retries on the liveness probe to avoid
having our container get restarted when it is temporarily slow due
to running in an environment under resource pressure
- Use less conservative timeouts and retries for the readiness probe,
so that an unhealthy pod is removed from the Service sooner than
the liveness probe would restart its container
- Tuning the settings for retries and timeouts seems to be a mysterious
art, so these are just a first draft
- It would sometimes fail with this error:
namespaces is forbidden: User "tanzu-user-authentication@groups.vmware.com"
cannot list resource "namespaces" in API group "" at the cluster scope
- Seems like it was because the RBAC rule added by the test needs a
moment before it starts to take effect, so change the test to retry
the API until it succeeds, failing after 3 seconds of trying
(sketched below).
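A sketch of that retry, using testify's require.Eventually (client
construction elided; the helper name is made up):
```go
package integration

import (
	"context"
	"testing"
	"time"

	"github.com/stretchr/testify/require"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// waitForRBAC polls until the freshly created RBAC rule takes effect,
// failing the test if requests still error after 3 seconds.
func waitForRBAC(t *testing.T, client kubernetes.Interface) {
	require.Eventually(t, func() bool {
		_, err := client.CoreV1().Namespaces().List(context.Background(), metav1.ListOptions{})
		return err == nil
	}, 3*time.Second, 250*time.Millisecond)
}
```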
- We want to follow the <noun>Request convention.
- The actual operation does not log in a user, but it does retrieve a
credential with which they can log in.
- This commit includes changes to all LoginRequest-related symbols and
constants to try to update their names to follow the new
CredentialRequest type.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
As discussed in API review, this field exists for convenience right
now. Since the username/groups are encoded in the Credential sent in
the LoginRequestStatus, the client still has access to their
user/groups information. We want to remove this for now to be
conservative and limit our API surface area (smaller surface area =
less to maintain). We can always add this back in the future.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- When we call the LoginRequest endpoint in loginrequest_test.go,
do it with an unauthenticated client, to make sure that endpoint works
with unauthenticated clients.
- For tests which want to test using certs returned by LoginRequest to
make API calls back to kube to check if those certs are working, make
sure they start with a bare client and then add only those certs.
Avoid accidentally picking up other kubeconfig configuration like
tokens, etc.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
find(1) seems to look at directory entries in the order in which they exist
in the directory fs entry. Let's sort these so that we get the same results
regardless of the order of the directory entries.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- For high availability reasons, we would like our app to scale linearly
with the size of the control plane. Using a DaemonSet allows us to run
one pod on each node-role.kubernetes.io/master node.
- The hope is that the Service that we create should load balance
between these pods appropriately.
Wow fun times with symlinks. We *think* this script should work in CI
now...but we'll see.
Previously we were seeing a false positive where even though the generated
code was out of date, the CI step did not report failure.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Add integration test for serving cert auto-generation and rotation
- Add unit test for `WithInitialEvent` of the cert manager controller
- Move UpdateAPIService() into the `apicerts` package, since that is
the only user of the function.
- Add a unit test for each cert controller
- Make DynamicTLSServingCertProvider an interface and use a mutex
internally
- Create a shared ToPEM function instead of having two very similar
functions
- Move the ObservableWithInformerOption test helper to testutils
- Rename some variables and imports
- Refactors the existing cert generation code into controllers
which read and write a Secret containing the certs
- Does not add any new functionality yet, e.g. no new handling
for cert expiration, and no leader election to allow for
multiple servers running simultaneously
- This commit also doesn't add new tests for the cert generation
code, but it should be more unit testable now as controllers
- No functional changes
- Move all the stuff about clients and controllers into the controller
package
- Add more comments and organize the code more into more helper
functions to make each function smaller
I suppose we could solve this in other ways, but this utility is
only used in one place right now, so it is easiest to copy it over.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Not strictly necessary at the moment because both our build layer
and our run layer are based on alpine, but static linking our binary
will help us later when we want to base our run image on something
closer to scratch
Instead, make the integration tests a separate module. You can't run
these tests by accident because they will not run at all when you
`go test` from the top-level directory. You will need to `cd test`
before using `go test` in order to run the integration tests.
Signed-off-by: Ryan Richard <richardry@vmware.com>
- Previously the golang code would create a Service and an APIService.
The APIService would be given an owner reference which pointed to
the namespace in which the app was installed.
- This prevented the app from being uninstalled. The namespace would
refuse to delete, so `kapp delete` or `kubectl delete` would fail.
- The new approach is to statically define the Service and an APIService
in the deployment.yaml, except for the caBundle of the APIService.
Then the golang code will perform an update to add the caBundle at
runtime.
- When the user uses `kapp deploy` or `kubectl apply` either tool will
notice that the caBundle is not declared in the yaml and will
therefore avoid editing that field.
- When the user uses `kapp delete` or `kubectl delete` either tool will
destroy the objects because they are statically declared with names
in the yaml, just like all of the other objects. There are no
ownerReferences used, so nothing should prevent the namespace from
being deleted.
- This approach also allows us to have less golang code to maintain.
- In the future, if our golang controllers want to dynamically add
an Ingress or other objects, they can still do that. An Ingress
would point to our statically defined Service as its backend.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- We are temporarily adding this change on the main branch so that CI works
with the main branch and we can iterate on our changes on our PR branch.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Why? Because the discovery URL is already there in the kubeconfig; let's
not make our lives more complicated by passing it in via an env var.
- Also allow for ytt callers to not specify data.values.discovery_url - there
are going to be a non-trivial number of installers of placeholder-name
that want to use the server URL found in the cluster-info ConfigMap.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Seems like the next step is to allow override of the CA bundle; I didn't
do that here for simplicity of the commit, but seems like it is the right
thing to do in the future.
- Also includes bumping the api and client-go dependencies to the newer
version which also moved LoginDiscoveryConfig to the
crds.placeholder.suzerain-io.github.io group in the generated code
Our unit test command is going to get slightly more complex in a future revision. This should let us avoid having to sync the CI pipeline definition so many times.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Also, don't repeat `spec.Parallel()` because, according to the docs
for the spec package, "options are inherited by subgroups and subspecs"
- Two tests are left pending to be filled in on the next commit
- `Before` gives a nice place to call `require.New(t)` to make the assertion lines more terse
- Just delete the keys for testing when env vars are missing
This is kind of a subtle bug, but we were using the unversioned Kubernetes type package here, where we should have been using the v1beta1 version. They have the same fields, but they serialize to JSON differently.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
The type signatures of these methods make them easy to mix up. `require.Error()` asserts that there is any non-nil error -- the last parameter is an optional human-readable message to log when the assertion fails. `require.EqualError()` asserts that there is a non-nil error _and_ that when you call `err.Error()`, the string matches the expected value. It also takes an additional optional parameter to specify the log message.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
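In code, the difference looks like this:
```go
package example

import (
	"errors"
	"testing"

	"github.com/stretchr/testify/require"
)

func TestErrorAssertions(t *testing.T) {
	err := errors.New("pod not found")

	// Passes for any non-nil error; the extra argument is only a message
	// that gets logged if the assertion fails.
	require.Error(t, err, "expected the lookup to fail")

	// Passes only if err is non-nil AND err.Error() exactly equals the
	// expected string.
	require.EqualError(t, err, "pod not found")
}
```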
Again, no idea why, but this word has two commonly accepted spellings, and Go code seems to very consistently use the one with one "l".
Signed-off-by: Matt Moyer <moyerm@vmware.com>
The error was:
```
internal/certauthority/certauthority.go:68:15: err113: do not define dynamic errors, use wrapped static errors instead: "fmt.Errorf(\"expected CA to be a single certificate, found %d certificates\", certCount)" (goerr113)
return nil, fmt.Errorf("expected CA to be a single certificate, found %d certificates", certCount)
^
exit status 1
```
I'm not sure if I love this err113 linter.
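For reference, the pattern the linter wants is a package-level
sentinel error wrapped with %w; a sketch:
```go
package certauthority

import (
	"errors"
	"fmt"
)

// ErrInvalidCACertCount is the static sentinel error that err113 prefers.
var ErrInvalidCACertCount = errors.New("expected CA to be a single certificate")

func validateCertCount(certCount int) error {
	if certCount != 1 {
		// Wrapping with %w lets callers match via
		// errors.Is(err, ErrInvalidCACertCount).
		return fmt.Errorf("%w: found %d certificates", ErrInvalidCACertCount, certCount)
	}
	return nil
}
```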
It turns out these fields are not meant to be base64 encoded, even though that's how they are in the kubeconfig.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
I think we may still split this apart into multiple packages, but for now it works pretty well in both use cases.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Dynamically grant RBAC permission to the test user to allow them
to make read requests via the API
- Then use the credential returned from the LoginRequest to make a
request back to the API server which should be successful
This is a somewhat more basic way to get access to the certificate and private key we need to issue short lived certificates.
The host path, tolerations, and node selector here should work on any kubeadm-derived cluster including TKG-S and Kind.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This will make manual testing easier and seems like a reasonable tradeoff. We'll iterate more in the future.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Mostly testing the way that the validation webhooks are called
- Also error when the auth webhook does not return user info, since we wouldn't know who you are in that case
- Also fix mistakes in the deployment.yaml
- Also hardcode the ownerRef kind and version because otherwise we get an error
Signed-off-by: Monis Khan <mok@vmware.com>
- Users may want to consume pkg/config to generate configuration files.
- This also involved putting config-related utilities in the config
package for ease of consumption.
- We did not add versioning to the Config type for now...this is
something we will likely do in the future, but it is not deemed
necessary this early in the project.
- The config file format tries to follow the patterns of Kube. One such
example of this is requiring the use of base64-encoded CA bundle PEM
bytes instead of a file path. This also slightly simplifies the config
file handling because we don't have to 1) read in a file or 2) deal
with the error case of the file not being there.
- The webhook code from k8s.io/apiserver is really exactly what we want
here. If this dependency gets too burdensome, we can always drop it,
but the pros outweigh the cons at the moment.
- Writing out a kubeconfig to disk to configure the webhook is a little
janky, but hopefully this won't hurt performance too much in the year
2020.
- Also bonus: call the right *Serve*() function when starting our
servers.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Trying to use "placeholder-name" or "placeholder_name" everywhere
that should later be changed to the actual name of the product,
so we can just do a simple search and replace when we have a name.
Below is a list of solutions where Pinniped is being used as a component.
**[Kubeapps](https://kubeapps.com/)**
Kubeapps uses Pinniped to [enable SSO authentication](https://github.com/kubeapps/kubeapps/blob/master/docs/user/using-an-OIDC-provider-with-pinniped.md) when running on clusters where SSO cannot be configured for the cluster API server.
**[VMware Tanzu Kubernetes Grid (TKG)](https://tanzu.vmware.com/kubernetes-grid)**
TKG uses Pinniped to provide a seamless SSO experience across management and workload clusters.
**[VMware Tanzu Mission Control (TMC)](https://tanzu.vmware.com/mission-control)**
TMC uses Pinniped to provide a uniform authentication experience across all attached clusters.
## Adding your organization to the list of adopters
If you are using Pinniped and would like to be included in the list of Pinniped Adopters, add an SVG version of your logo that is less than 150 KB to
the [img directory](https://github.com/vmware-tanzu/pinniped/tree/main/site/themes/pinniped/static/img) in this repo and submit a pull request with your change including 1-2 sentences describing how your organization is using Pinniped. Name the image file something that
reflects your company (e.g., if your company is called Acme, name the image acme.svg). Please feel free to send us a message in [#pinniped](https://kubernetes.slack.com/archives/C01BW364RJA) with any questions you may have.
Please see https://github.com/vmware/pinniped/blob/main/CODE_OF_CONDUCT.md
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at [oss-coc@vmware.com](mailto:oss-coc@vmware.com). All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of actions.
**Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.0,
available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/diversity).
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.
Please see https://github.com/vmware/pinniped/blob/main/CONTRIBUTING.md
# Contributing to Pinniped
Contributions to Pinniped are welcome. Here are some things to help you get started.
## Code of Conduct
Please see the [Code of Conduct](./CODE_OF_CONDUCT.md).
## Project Scope
See [SCOPE.md](./SCOPE.md) for some guidelines about what we consider in and out of scope for Pinniped.
## Community Meetings
Pinniped is better because of our contributors and maintainers. It is because of you that we can bring great software to the community. Please join us during our online community meetings, occurring every first and third Thursday of the month at 9AM PT / 12PM ET. Use [this Zoom Link](https://vmware.zoom.us/j/93798188973?pwd=T3pIMWxReEQvcWljNm1admRoZTFSZz09) to attend and add any agenda items you wish to discuss to [the notes document](https://hackmd.io/rd_kVJhjQfOvfAWzK8A3tQ?view). Join our [Google Group](https://groups.google.com/u/1/g/project-pinniped) to receive invites to this meeting.
If the meeting day falls on a US holiday, please consider that occurrence of the meeting to be canceled.
## Discussion
Got a question, comment, or idea? Please don't hesitate to reach out via the GitHub [Discussions](https://github.com/vmware-tanzu/pinniped/discussions) tab at the top of this page or in the [#pinniped channel](https://kubernetes.slack.com/archives/C01BW364RJA) of the Kubernetes Slack workspace.
## Issues
Need an idea for a project to get started contributing? Take a look at the open [issues](https://github.com/vmware-tanzu/pinniped/issues).
and implements different integration strategies for various Kubernetes
distributions to make authentication possible.
The Pinniped Concierge can be configured to hook into the Pinniped Supervisor's
federated credentials, or it can authenticate users directly via external IDP
credentials.
To learn more, see [architecture](https://pinniped.dev/docs/background/architecture/).
## Getting started with Pinniped
Care to kick the tires? It's easy to [install and try Pinniped](https://pinniped.dev/docs/demo/).
## Community meetings
Pinniped is better because of our contributors and maintainers. It is because of you that we can bring great software to the community. Please join us during our online community meetings, occurring every first and third Thursday of the month at 9 AM PT / 12 PM ET. Use [this Zoom Link](https://vmware.zoom.us/j/93798188973?pwd=T3pIMWxReEQvcWljNm1admRoZTFSZz09) to attend and add any agenda items you wish to discuss to [the notes document](https://hackmd.io/rd_kVJhjQfOvfAWzK8A3tQ?view). Join our [Google Group](https://groups.google.com/g/project-pinniped) to receive invites to this meeting.
If the meeting day falls on a US holiday, please consider that occurrence of the meeting to be canceled.
## Discussion
Got a question, comment, or idea? Please don't hesitate to reach out via the GitHub [Discussions](https://github.com/vmware-tanzu/pinniped/discussions) tab at the top of this page or in the [#pinniped channel](https://kubernetes.slack.com/archives/C01BW364RJA) of the Kubernetes Slack workspace.
## Contributions
Contributions are welcome. Before contributing, please see the [contributing guide](CONTRIBUTING.md).
## Reporting security vulnerabilities
Please follow the procedure described in [SECURITY.md](https://github.com/vmware/pinniped/blob/main/SECURITY.md).
## Creating a release
When the team is preparing to ship a release, a maintainer will create a new
GitHub [Issue](https://github.com/vmware/pinniped/issues/new/choose) in this repo to
collaboratively track progress on the release checklist. As tasks are completed,
the team will check them off. When all the tasks are completed, the issue is closed.
The release checklist is committed to this repo as an [issue template](https://github.com/vmware/pinniped/tree/main/.github/ISSUE_TEMPLATE/release_checklist.md).
## Pipelines
Pinniped uses [Concourse](https://concourse-ci.org) for CI/CD.
Our Concourse instance runs at [ci.pinniped.broadcom.net](https://ci.pinniped.broadcom.net), on a network that can only be reached from inside the corporate network.
The following pipelines are implemented in this branch. Not all pipelines are necessarily publicly visible, although our goal is to make them all visible.
- `main`
This is the main pipeline that runs on merges to `main`. It builds, tests, and (when manually triggered) releases from main.
- `pull-requests`
This is a pipeline that triggers for each open pull request. It runs a smaller subset of the integration tests and validations than the `pinniped` pipeline.
- `dockerfile-builders`
This pipeline builds a bunch of custom utility container images that are used in our CI and testing.
- `build-gi-cli` (a container image that includes the GitHub CLI)
- `build-github-pr-resource` (a [fork](https://github.com/pinniped-ci-bot/github-pr-resource) of the `github-pr-resource` with support for gating PRs for untrusted users)
- `build-code-coverage-uploader` (uploading code coverage during unit tests)
- `build-eks-deployer-dockerfile` (deploying our app to EKS clusters)
- `build-k8s-app-deployer-dockerfile` (deploying our app to clusters)
- `build-pool-trigger-resource-dockerfile` (an updated implementation of the [pool-trigger-resource](https://github.com/cfmobile/pool-trigger-resource) for use in our CI)
- `build-integration-test-runner-beta-dockerfile` (running our integration tests with the latest Chrome beta version)
- `build-deployment-yaml-formatter-dockerfile` (templating our deployment YAML during a release)
- `build-crane` (copy and tag container images during release)
- `build-k8s-code-generator-*` (running our Kubernetes code generation under different Kubernetes dependency versions)
- `build-test-dex` (a Dex used during tests)
- `build-test-cfssl` (a cfssl used during tests)
- `build-test-kubectl` (a kubectl used during tests)
- `build-test-forward-proxy` (a Squid forward proxy used during tests)
- `build-test-bitnami-ldap` (an OpenLDAP used during tests)
- `cleanup-aws`
This runs a script that runs [aws-nuke](https://github.com/rebuy-de/aws-nuke) against our test AWS account.
This was occasionally needed because [eksctl](https://eksctl.io/) sometimes fails and leaks AWS resources. These resources cost money and use up our AWS quota.
However, we seem to have worked around these issues and this pipeline has not been used for some time.
These jobs are only triggered manually. This is dangerous and should be used with care.
- `concourse-workers`
Deploys worker replicas on a long-lived GKE cluster that runs the Concourse workers, and can scale them up or down.
- `go-compatibility`
This pipeline runs nightly jobs that validate the compatibility of our code as a Go module in various contexts. We have jobs that test that our code compiles under older Go versions and that our CLI can be installed using `go install`.
- `security-scan`
This pipeline has nightly jobs that run security scans on our current main branch and most recently released artifacts.
The tools we use are:
- [sonatype-nexus-community/nancy](https://github.com/sonatype-nexus-community/nancy), which scans Go module versions.
- [aquasecurity/trivy](https://github.com/aquasecurity/trivy), which scans container images and Go binaries.
- [govulncheck](https://pkg.go.dev/golang.org/x/vuln/cmd/govulncheck), which scans Go code to find calls to known-vulnerable dependencies.
This pipeline also has a job called `all-golang-deps-updated` which automatically submits PRs to update all
direct dependencies in Pinniped's go.mod file, and update the Golang and distroless container images used in
Pinniped's Dockerfiles.
- `kind-node-builder`
A nightly build job which uses the latest version of kind to build the HEAD of Kubernetes' master branch as a container
image that can be used to deploy kind clusters. Other pipelines use this container image to install Pinniped and run
integration tests. This gives us insight into any compatibility problems with the next release of Kubernetes.
## Deploying pipeline changes
After any shared tasks (`./pipelines/shared-tasks`) or helpers (`./pipelines/shared-helpers`) are edited,
the commits must be pushed to the `ci` branch of this repository to take effect.
After editing any CI secrets or pipeline definitions, a maintainer must run the corresponding
`./pipelines/$PIPELINE_NAME/update-pipeline.sh` script to apply the changes to Concourse.
To deploy _all_ pipelines, a maintainer can run `./pipelines/update-all-pipelines.sh`.
Don't forget to commit and push your changes after applying them!
## GitHub webhooks for pipelines
Some pipelines use GitHub [webhooks to trigger resource checks](https://concourse-ci.org/resources.html#schema.resource.webhook_token),
rather than the default of polling every minute, to make these pipelines more responsive and use fewer compute resources
for running checks. Refer to places where `webhook_token` is configured in various `pipeline.yml` files.
To make these webhooks work, they must be defined on the [GitHub repo's settings](https://github.com/vmware/pinniped/settings/hooks).
## Installing and operating Concourse
See [infra/README.md](./infra/README.md) for details about how Concourse was installed and how it can be operated.
## Acceptance environments
In addition to the many ephemeral Kubernetes clusters we use for testing, we also deploy a long-running acceptance environment on Google Kubernetes Engine (GKE): the `gke-acceptance-cluster` cluster in our GCP project in the `us-west1-c` availability zone.
To access this cluster, download the kubeconfig to `gke-acceptance.yaml` by running:
This document provides a link to the [Pinniped Project issues](https://github.com/vmware-tanzu/pinniped/issues) list, which serves as the up-to-date description of items that are in the Pinniped release pipeline. Most items are gathered from the community or include a feedback loop with the community. This should serve as a reference point for Pinniped users and contributors to understand where the project is heading, and help determine if a contribution could be conflicting with a longer-term plan.
### How to help?
Discussion on the roadmap can take place in threads under [Issues](https://github.com/vmware-tanzu/pinniped/issues) or in [community meetings](https://github.com/vmware-tanzu/pinniped/blob/main/CONTRIBUTING.md#meeting-with-the-maintainers). Please open and comment on an issue if you want to provide suggestions and feedback to an item in the roadmap. Please review the roadmap to avoid potential duplicated effort.
### Need an idea for a contribution?
We’ve created an [Opportunity Areas](https://github.com/vmware-tanzu/pinniped/discussions/483) discussion thread that outlines some areas we believe are excellent starting points for the community to get involved. In that discussion we’ve included specific work items that one might consider that also support the high-level items presented in our roadmap.
### How to add an item to the roadmap?
Please open an issue to track any initiative on the roadmap of Pinniped (usually driven by new feature requests). We will work with and rely on our community to focus our efforts to improve Pinniped.
### Current Roadmap
The following table includes the current roadmap for Pinniped. If you have any questions or would like to contribute to Pinniped, please attend a [community meeting](https://github.com/vmware-tanzu/pinniped/blob/main/CONTRIBUTING.md#meeting-with-the-maintainers) to discuss with our team. If you don't know where to start, we are always looking for contributors that will help us reduce technical, automation, and documentation debt. Please take the timelines & dates as proposals and goals. Priorities and requirements change based on community feedback, roadblocks encountered, community contributions, etc. If you depend on a specific item, we encourage you to attend community meetings to get updated status information, or help us deliver that feature by contributing to Pinniped.
Last Updated: March 2021
|Theme|Description|Timeline|
|--|--|--|
|Impersonation Proxy|Adds support for more types of clusters (managed services)|Mar 2021|
|Device Code Flow|Add support for OAuth 2.0 Device Authorization Grant in the Pinniped CLI and Supervisor|Apr 2021|
|Improved Documentation|Reorganizing and improving Pinniped docs; new how-to guides and tutorials|May 2021|
|CLI Improvements|Improving CLI UX for setting up Supervisor IDPs|May 2021|
|Multiple IDPs|Support for multiple upstream IDPs to be configured simultaneously|Jun 2021|
|Improving Security Posture|Offer the best security posture for Kubernetes cluster authentication|Exploring/Ongoing|
|Improve our CI/CD systems|Upgrade tests; make Kind more efficient and reliable for CI; Windows tests; performance tests; scale tests; soak tests|Exploring/Ongoing|
|Telemetry|Adding some useful phone home metrics as well as some vanity metrics|Exploring/Ongoing|
|Observability|Expose Pinniped metrics through Prometheus Integration|Exploring/Ongoing|
Please see https://github.com/vmware/pinniped/blob/main/SECURITY.md
# Security Release Process
Pinniped provides identity services for Kubernetes clusters. The community has adopted this security disclosure and response policy to ensure we responsibly handle critical issues.
## Supported Versions
As of right now, only the latest version of Pinniped is supported.
## Reporting a Vulnerability - Private Disclosure Process
Security is of the highest importance and all security vulnerabilities or suspected security vulnerabilities should be reported to Pinniped privately, to minimize attacks against current users of Pinniped before they are fixed. Vulnerabilities will be investigated and patched on the next patch (or minor) release as soon as possible. This information could be kept entirely internal to the project.
If you know of a publicly disclosed security vulnerability for Pinniped, please **IMMEDIATELY** contact the VMware Security Team (security@vmware.com). The use of encrypted email is encouraged. The public PGP key can be found at https://kb.vmware.com/kb/1055.
**IMPORTANT: Do not file public issues on GitHub for security vulnerabilities**
To report a vulnerability or a security-related issue, please email security@vmware.com with the details of the vulnerability. The email will be fielded by the VMware Security Team and then shared with the Pinniped maintainers who have committer and release permissions. Emails will be addressed within 3 business days, including a detailed plan to investigate the issue and any potential workarounds to perform in the meantime. Do not report non-security-impacting bugs through this channel. Use [GitHub issues](https://github.com/vmware-tanzu/pinniped/issues/new/choose) instead.
## Proposed Email Content
Provide a descriptive subject line and in the body of the email include the following information:
* Basic identity information, such as your name and your affiliation or company.
* Detailed steps to reproduce the vulnerability (POC scripts, screenshots, and logs are all helpful to us).
* Description of the effects of the vulnerability on Pinniped and the related hardware and software configurations, so that the VMware Security Team can reproduce it.
* How the vulnerability affects Pinniped usage and an estimation of the attack surface, if there is one.
* List other projects or dependencies that were used in conjunction with Pinniped to produce the vulnerability.
## When to report a vulnerability
* When you think Pinniped has a potential security vulnerability.
* When you suspect a potential vulnerability but you are unsure that it impacts Pinniped.
* When you know of or suspect a potential vulnerability on another project that is used by Pinniped.
## Patch, Release, and Disclosure
The VMware Security Team will respond to vulnerability reports as follows:
1. The Security Team will investigate the vulnerability and determine its effects and criticality.
2. If the issue is not deemed to be a vulnerability, the Security Team will follow up with a detailed reason for rejection.
3. The Security Team will initiate a conversation with the reporter within 3 business days.
4. If a vulnerability is acknowledged and the timeline for a fix is determined, the Security Team will work on a plan to communicate with the appropriate community, including identifying mitigating steps that affected users can take to protect themselves until the fix is rolled out.
5. The Security Team will also create a [CVSS](https://www.first.org/cvss/specification-document) score using the [CVSS Calculator](https://www.first.org/cvss/calculator/3.0). The Security Team makes the final call on the calculated CVSS; it is better to move quickly than to make the CVSS perfect. Issues may also be reported to [Mitre](https://cve.mitre.org/) using this [scoring calculator](https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator). The CVE will initially be set to private.
6. The Security Team will work on fixing the vulnerability and perform internal testing before preparing to roll out the fix.
7. The Security Team will provide early disclosure of the vulnerability by emailing the [Pinniped Distributors](https://groups.google.com/g/project-pinniped-distributors) mailing list. Distributors can initially plan for the vulnerability patch ahead of the fix, and later can test the fix and provide feedback to the Pinniped team. See the section **Early Disclosure to Pinniped Distributors List** for details about how to join this mailing list.
8. A public disclosure date is negotiated by the VMware Security Team, the bug submitter, and the distributors list. We prefer to fully disclose the bug as soon as possible once a user mitigation or patch is available. It is reasonable to delay disclosure when the bug or the fix is not yet fully understood, the solution is not well-tested, or for distributor coordination. The timeframe for disclosure is from immediate (especially if it’s already publicly known) to a few weeks. For a critical vulnerability with a straightforward mitigation, we expect the time from the report date to the public disclosure date to be on the order of 14 business days. The VMware Security Team holds the final say when setting a public disclosure date.
9. Once the fix is confirmed, the Security Team will patch the vulnerability in the next patch or minor release, and backport a patch release into all earlier supported releases. Upon release of the patched version of Pinniped, we will follow the **Public Disclosure Process**.
## Public Disclosure Process
The Security Team publishes a [public advisory](https://github.com/vmware-tanzu/pinniped/security/advisories) to the Pinniped community via GitHub. In most cases, additional communication via Slack, Twitter, mailing lists, blog and other channels will assist in educating Pinniped users and rolling out the patched release to affected users.
The Security Team will also publish any mitigating steps users can take until the fix can be applied to their Pinniped instances. Pinniped distributors will handle creating and publishing their own security advisories.
## Mailing lists
* Use security@vmware.com to report security concerns to the VMware Security Team, who uses the list to privately discuss security issues and fixes prior to disclosure. The use of encrypted email is encouraged. The public PGP key can be found at https://kb.vmware.com/kb/1055.
* Join the [Pinniped Distributors](https://groups.google.com/g/project-pinniped-distributors) mailing list for early private information and vulnerability disclosure. Early disclosure may include mitigating steps and additional information on security patch releases. See below for information on how Pinniped distributors or vendors can apply to join this list.
## Early Disclosure to Pinniped Distributors List
The private list is intended to be used primarily to provide actionable information to multiple distributor projects at once. This list is not intended to inform individuals about security issues.
## Membership Criteria
To be eligible to join the [Pinniped Distributors](https://groups.google.com/g/project-pinniped-distributors) mailing list, you should:
1. Be an active distributor of Pinniped.
2. Have a user base that is not limited to your own organization.
3. Have a publicly verifiable track record up to the present day of fixing security issues.
4. Not be a downstream or rebuild of another distributor.
5. Be a participant and active contributor in the Pinniped community.
6. Accept the Embargo Policy that is outlined below.
7. Have someone who is already on the list vouch for the person requesting membership on behalf of your distribution.
**The terms and conditions of the Embargo Policy apply to all members of this mailing list. A request for membership represents your acceptance to the terms and conditions of the Embargo Policy.**
## Embargo Policy
The information that members receive on the Pinniped Distributors mailing list must not be made public, shared, or even hinted at anywhere beyond those who need to know within your specific team, unless you receive explicit approval to do so from the VMware Security Team. This remains true until the public disclosure date/time agreed upon by the list. Members of the list must not use the information for any purpose other than getting the issue fixed for their respective distribution's users.
Before you share any information from the list with members of your team who are required to fix the issue, these team members must agree to the same terms, and only be provided with information on a need-to-know basis.
In the unfortunate event that you share information beyond what is permitted by this policy, you must urgently inform the VMware Security Team (security@vmware.com) of exactly what information was leaked and to whom. If you continue to leak information and break the policy outlined here, you will be permanently removed from the list.
## Requesting to Join
Send new membership requests to https://groups.google.com/g/project-pinniped-distributors. In the body of your request, please specify how you qualify for membership and fulfill each criterion listed in the Membership Criteria section above.
## Confidentiality, integrity and availability
We consider vulnerabilities leading to the compromise of data confidentiality, elevation of privilege, or integrity to be our highest priority concerns. Availability, in particular in areas relating to DoS and resource exhaustion, is also a serious security concern. The VMware Security Team takes all vulnerabilities, potential vulnerabilities, and suspected vulnerabilities seriously and will investigate them in an urgent and expeditious manner.
Short:"Generate a Pinniped-based kubeconfig for a cluster",
SilenceUsage:true,
}
flagsgetKubeconfigParams
namespacestring// unused now
)
f:=cmd.Flags()
f.StringVar(&flags.staticToken,"static-token","","Instead of doing an OIDC-based login, specify a static token")
f.StringVar(&flags.staticTokenEnvName,"static-token-env","","Instead of doing an OIDC-based login, read a static token from the environment")
f.BoolVar(&flags.concierge.disabled,"no-concierge",false,"Generate a configuration which does not use the Concierge, but sends the credential to the cluster directly")
f.StringVar(&namespace,"concierge-namespace","pinniped-concierge","Namespace in which the Concierge was installed")
f.StringVar(&flags.concierge.credentialIssuer,"concierge-credential-issuer","","Concierge CredentialIssuer object to use for autodiscovery (default: autodiscover)")
f.StringVar(&flags.concierge.authenticatorType,"concierge-authenticator-type","","Concierge authenticator type (e.g., 'webhook', 'jwt') (default: autodiscover)")
f.StringVar(&flags.concierge.authenticatorName,"concierge-authenticator-name","","Concierge authenticator name (default: autodiscover)")
f.StringVar(&flags.concierge.apiGroupSuffix,"concierge-api-group-suffix",groupsuffix.PinnipedDefaultSuffix,"Concierge API group suffix")
f.BoolVar(&flags.concierge.skipWait,"concierge-skip-wait",false,"Skip waiting for any pending Concierge strategies to become ready (default: false)")
f.Var(&flags.concierge.caBundle,"concierge-ca-bundle","Path to TLS certificate authority bundle (PEM format, optional, can be repeated) to use when connecting to the Concierge")
f.StringVar(&flags.concierge.endpoint,"concierge-endpoint","","API base for the Concierge endpoint")
f.Var(&flags.concierge.mode,"concierge-mode","Concierge mode of operation")
f.StringVar(&flags.oidc.clientID,"oidc-client-id","pinniped-cli","OpenID Connect client ID (default: autodiscover)")
f.Uint16Var(&flags.oidc.listenPort,"oidc-listen-port",0,"TCP port for localhost listener (authorization code flow only)")
f.StringSliceVar(&flags.oidc.scopes,"oidc-scopes",[]string{oidc.ScopeOfflineAccess,oidc.ScopeOpenID,"pinniped:request-audience"},"OpenID Connect scopes to request during login")
f.BoolVar(&flags.oidc.skipBrowser,"oidc-skip-browser",false,"During OpenID Connect login, skip opening the browser (just print the URL)")
f.StringVar(&flags.oidc.sessionCachePath,"oidc-session-cache","","Path to OpenID Connect session cache file")
f.Var(&flags.oidc.caBundle,"oidc-ca-bundle","Path to TLS certificate authority bundle (PEM format, optional, can be repeated)")
f.BoolVar(&flags.oidc.debugSessionCache,"oidc-debug-session-cache",false,"Print debug logs related to the OpenID Connect session cache")
f.StringVar(&flags.oidc.requestAudience,"oidc-request-audience","","Request a token with an alternate audience using RFC8693 token exchange")
f.StringVar(&flags.kubeconfigPath,"kubeconfig",os.Getenv("KUBECONFIG"),"Path to kubeconfig file")
f.StringVar(&flags.kubeconfigContextOverride,"kubeconfig-context","","Kubeconfig context name (default: current active context)")
f.BoolVar(&flags.skipValidate,"skip-validation",false,"Skip final validation of the kubeconfig (default: false)")
f.DurationVar(&flags.timeout,"timeout",10*time.Minute,"Timeout for autodiscovery and validation")
return nil, fmt.Errorf("multiple authenticators were found, so the --concierge-authenticator-type/--concierge-authenticator-name flags must be specified")
cmd.Flags().Uint16Var(&flags.listenPort, "listen-port", 0, "TCP port for localhost listener (authorization code flow only)")
cmd.Flags().StringSliceVar(&flags.scopes, "scopes", []string{oidc.ScopeOfflineAccess, oidc.ScopeOpenID, "pinniped:request-audience"}, "OIDC scopes to request during login")
cmd.Flags().BoolVar(&flags.skipBrowser, "skip-browser", false, "Skip opening the browser (just print the URL)")
cmd.Flags().StringVar(&flags.sessionCachePath, "session-cache", filepath.Join(mustGetConfigDir(), "sessions.yaml"), "Path to session cache file")
cmd.Flags().StringSliceVar(&flags.caBundlePaths, "ca-bundle", nil, "Path to TLS certificate authority bundle (PEM format, optional, can be repeated)")
cmd.Flags().StringSliceVar(&flags.caBundleData, "ca-bundle-data", nil, "Base64 encoded TLS certificate authority bundle (base64 encoded PEM format, optional, can be repeated)")
cmd.Flags().BoolVar(&flags.debugSessionCache, "debug-session-cache", false, "Print debug logs related to the session cache")
cmd.Flags().StringVar(&flags.requestAudience, "request-audience", "", "Request a token with an alternate audience using RFC8693 token exchange")
cmd.Flags().BoolVar(&flags.conciergeEnabled, "enable-concierge", false, "Use the Concierge to login")
cmd.Flags().StringVar(&conciergeNamespace, "concierge-namespace", "pinniped-concierge", "Namespace in which the Concierge was installed")
cmd.Flags().StringVar(&flags.conciergeAuthenticatorType, "concierge-authenticator-type", "", "Concierge authenticator type (e.g., 'webhook', 'jwt')")
Error: invalid Concierge parameters: invalid API group suffix: a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
`),
},
{
name:"login error",
args:[]string{
"--client-id","test-client-id",
"--issuer","test-issuer",
},
loginErr:fmt.Errorf("some login error"),
wantOptionsCount:3,
wantError:true,
wantStderr:here.Doc(`
Error: could not complete Pinniped login: some login error
Error: invalid Concierge parameters: invalid API group suffix: a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
wantStderr:"Error: could not complete WhoAmIRequest (is the Pinniped WhoAmI API running and healthy?): whoamirequests.identity.concierge.pinniped.dev \"whatever\" not found\n",
description:"JWTAuthenticator describes the configuration of a JWT authenticator.
\n Upon receiving a signed JWT, a JWTAuthenticator will performs some validation
on it (e.g., valid signature, existence of claims, etc.) and extract the
username and groups from the token."
properties:
apiVersion:
description:'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type:string
kind:
description:'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type:string
metadata:
type:object
spec:
description:Spec for configuring the authenticator.
properties:
audience:
description:Audience is the required value of the "aud" JWT claim.
minLength:1
type:string
claims:
description:Claims allows customization of the claims that will be
mapped to user identity for Kubernetes access.
properties:
groups:
description:Groups is the name of the claim which should be read
to extract the user's group membership from the JWT token. When
not specified, it will default to "groups".
type:string
username:
description:Username is the name of the claim which should be
read to extract the username from the JWT token. When not specified,
it will default to "username".
type:string
type:object
issuer:
description:Issuer is the OIDC issuer URL that will be used to discover
public signing keys. Issuer is also used to validate the "iss" JWT
claim.
minLength:1
pattern:^https://
type:string
tls:
description:TLS configuration for communicating with the OIDC provider.
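To make the schema above concrete, here is a minimal JWTAuthenticator manifest sketch. The name, issuer, and audience values are invented, and the API group shown assumes the default `pinniped.dev` suffix (it changes if a custom API group suffix is configured):

```yaml
apiVersion: authentication.concierge.pinniped.dev/v1alpha1
kind: JWTAuthenticator
metadata:
  name: my-jwt-authenticator          # hypothetical name
spec:
  issuer: https://issuer.example.com  # hypothetical OIDC issuer; must be https://
  audience: my-cluster-audience       # hypothetical required "aud" claim value
  claims:
    username: email   # read usernames from the "email" claim instead of the default "username"
    groups: groups    # the default, shown explicitly for clarity
```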
description: WebhookAuthenticator describes the configuration of a webhook
  authenticator.
properties:
  apiVersion:
    description: 'APIVersion defines the versioned schema of this representation
      of an object. Servers should convert recognized schemas to the latest
      internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
    type: string
  kind:
    description: 'Kind is a string value representing the REST resource this
      object represents. Servers may infer this from the endpoint the client
      submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
    type: string
  metadata:
    type: object
  spec:
    description: Spec for configuring the authenticator.
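The spec portion of this schema is truncated above, so the sketch below is an assumption: it uses the `endpoint` and `tls.certificateAuthorityData` fields from the WebhookAuthenticator API type rather than the fragment itself, and all values are invented:

```yaml
apiVersion: authentication.concierge.pinniped.dev/v1alpha1
kind: WebhookAuthenticator
metadata:
  name: my-webhook-authenticator    # hypothetical name
spec:
  endpoint: https://webhook.example.com/authenticate  # hypothetical TokenReview-style webhook URL
  tls:
    certificateAuthorityData: LS0tLS1CRUdJTi4uLg==    # placeholder base64-encoded PEM CA bundle
```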
description: Describes the configuration status of a Pinniped credential issuer.
properties:
  apiVersion:
    description: 'APIVersion defines the versioned schema of this representation
      of an object. Servers should convert recognized schemas to the latest
      internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
    type: string
  kind:
    description: 'Kind is a string value representing the REST resource this
      object represents. Servers may infer this from the endpoint the client
      submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
    type: string
  metadata:
    type: object
  status:
    description: Status of the credential issuer.
    properties:
      kubeConfigInfo:
        description: Information needed to form a valid Pinniped-based kubeconfig
          using this credential issuer. This field is deprecated and will
          be removed in a future version.
        properties:
          certificateAuthorityData:
            description: The K8s API server CA bundle.
            minLength: 1
            type: string
          server:
            description: The K8s API server URL.
            minLength: 1
            pattern: ^https://|^http://
            type: string
        required:
        - certificateAuthorityData
        - server
        type: object
      strategies:
        description: List of integration strategies that were attempted by
          Pinniped.
        items:
          description: Status of an integration strategy that was attempted
            by Pinniped.
          properties:
            frontend:
              description: Frontend describes how clients can connect using
                this strategy.
              properties:
                impersonationProxyInfo:
                  description: ImpersonationProxyInfo describes the parameters
                    for the impersonation proxy on this Concierge. This field
                    is only set when Type is "ImpersonationProxy".
                  properties:
                    certificateAuthorityData:
                      description: CertificateAuthorityData is the base64-encoded
                        PEM CA bundle of the impersonation proxy.
                      minLength: 1
                      type: string
                    endpoint:
                      description: Endpoint is the HTTPS endpoint of the impersonation
                        proxy.
                      minLength: 1
                      pattern: ^https://
                      type: string
                  required:
                  - certificateAuthorityData
                  - endpoint
                  type: object
                tokenCredentialRequestInfo:
                  description: TokenCredentialRequestAPIInfo describes the
                    parameters for the TokenCredentialRequest API on this
                    Concierge. This field is only set when Type is "TokenCredentialRequestAPI".
                  properties:
                    certificateAuthorityData:
                      description: CertificateAuthorityData is the base64-encoded
                        Kubernetes API server CA bundle.
                      minLength: 1
                      type: string
                    server:
                      description: Server is the Kubernetes API server URL.
                      minLength: 1
                      pattern: ^https://|^http://
                      type: string
                  required:
                  - certificateAuthorityData
                  - server
                  type: object
                type:
                  description: Type describes which frontend mechanism clients
                    can use with a strategy.
                  enum:
                  - TokenCredentialRequestAPI
                  - ImpersonationProxy
                  type: string
              required:
              - type
              type: object
            lastUpdateTime:
              description: When the status was last checked.
              format: date-time
              type: string
            message:
              description: Human-readable description of the current status.
              minLength: 1
              type: string
            reason:
              description: Reason for the current status.
              enum:
              - Listening
              - Pending
              - Disabled
              - ErrorDuringSetup
              - CouldNotFetchKey
              - CouldNotGetClusterInfo
              - FetchedKey
              type: string
            status:
              description: Status of the attempted integration strategy.
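Putting the status schema together, a populated strategy entry on a CredentialIssuer might look roughly like the sketch below. Only fields from the schema above are used; every value is invented, and the `status` value is an assumption because that field's enum is truncated in the fragment above:

```yaml
status:
  strategies:
  - frontend:
      type: TokenCredentialRequestAPI
      tokenCredentialRequestInfo:
        server: https://10.0.0.1:6443                   # invented API server URL
        certificateAuthorityData: LS0tLS1CRUdJTi4uLg==  # placeholder base64 CA bundle
    lastUpdateTime: "2021-03-01T00:00:00Z"              # invented timestamp
    message: Key was fetched successfully               # invented message
    reason: FetchedKey
    status: Success                                     # assumed enum value
```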
image_digest: #! e.g. sha256:f3c4fdfd3ef865d4b97a1fd295d94acc3f0c654c46b6f27ffad5cf80216903c8
image_tag: latest

#! Optionally specify a different image for the "kube-cert-agent" pod which is scheduled
#! on the control plane. This image needs only to include `sleep` and `cat` binaries.
#! By default, the same image specified for image_repo/image_digest/image_tag will be re-used.
kube_cert_agent_image:

#! Specifies a secret to be used when pulling the above `image_repo` container image.
#! Can be used when the above image_repo is a private registry.
#! Typically the value would be the output of: kubectl create secret docker-registry x --docker-server=https://example.io --docker-username="USERNAME" --docker-password="PASSWORD" --dry-run=client -o json | jq -r '.data[".dockerconfigjson"]'
#! Optional.
image_pull_dockerconfigjson: #! e.g. {"auths":{"https://registry.example.com":{"username":"USERNAME","password":"PASSWORD","auth":"BASE64_ENCODED_USERNAME_COLON_PASSWORD"}}}

#! Pinniped will try to guess the right K8s API URL for sharing that information with potential clients.
#! This setting allows the guess to be overridden.
#! Optional.
discovery_url: #! e.g., https://example.com

#! Specify the duration and renewal interval for the API serving certificate.
#! The defaults are set to expire the cert about every 30 days, and to rotate it
image_digest: #! e.g. sha256:f3c4fdfd3ef865d4b97a1fd295d94acc3f0c654c46b6f27ffad5cf80216903c8
image_tag: latest

#! Specifies a secret to be used when pulling the above `image_repo` container image.
#! Can be used when the above image_repo is a private registry.
#! Typically the value would be the output of: kubectl create secret docker-registry x --docker-server=https://example.io --docker-username="USERNAME" --docker-password="PASSWORD" --dry-run=client -o json | jq -r '.data[".dockerconfigjson"]'
#! Optional.
image_pull_dockerconfigjson: #! e.g. {"auths":{"https://registry.example.com":{"username":"USERNAME","password":"PASSWORD","auth":"BASE64_ENCODED_USERNAME_COLON_PASSWORD"}}}

run_as_user: 1001 #! run_as_user specifies the user ID that will own the process
run_as_group: 1001 #! run_as_group specifies the group ID that will own the process
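As a usage sketch, a deployer could override these defaults in a separate ytt data-values file like the one below; the digest shown reuses the example from the comments above, while the discovery URL is invented:

```yaml
#! Hedged example of a ytt data-values override file; values are illustrative only.
#@data/values
---
image_digest: sha256:f3c4fdfd3ef865d4b97a1fd295d94acc3f0c654c46b6f27ffad5cf80216903c8
image_tag: ""                                      #! typically left empty when pinning by digest
discovery_url: https://kubernetes.example.com:6443 #! invented override of the guessed K8s API URL
run_as_user: 1001
run_as_group: 1001
```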
description: FederationDomain describes the configuration of an OIDC provider.
properties:
  apiVersion:
    description: 'APIVersion defines the versioned schema of this representation
      of an object. Servers should convert recognized schemas to the latest
      internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
    type: string
  kind:
    description: 'Kind is a string value representing the REST resource this
      object represents. Servers may infer this from the endpoint the client
      submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
    type: string
  metadata:
    type: object
  spec:
    description: Spec of the OIDC provider.
    properties:
      issuer:
        description: "Issuer is the OIDC Provider's issuer, per the OIDC Discovery
          Metadata document, as well as the identifier that it will use for
          the iss claim in issued JWTs. This field will also be used as the
          base URL for any endpoints used by the OIDC Provider (e.g., if your
          issuer is https://example.com/foo, then your authorization endpoint
          will look like https://example.com/foo/some/path/to/auth/endpoint).
          \n See https://openid.net/specs/openid-connect-discovery-1_0.html#rfc.section.3
          for more information."
        minLength: 1
        type: string
      tls:
        description: TLS configures how this FederationDomain is served over
          Transport Layer Security (TLS).
        properties:
          secretName:
            description: "SecretName is an optional name of a Secret in the
              same namespace, of type `kubernetes.io/tls`, which contains
              the TLS serving certificate for the HTTPS endpoints served by
              this FederationDomain. When provided, the TLS Secret named here
              must contain keys named `tls.crt` and `tls.key` that contain
              the certificate and private key to use for TLS. \n Server Name
              Indication (SNI) is an extension to the Transport Layer Security
              (TLS) supported by all major browsers. \n SecretName is required
              if you would like to use different TLS certificates for issuers
              of different hostnames. SNI requests do not include port numbers,
              so all issuers with the same DNS hostname must use the same
              SecretName value even if they have different port numbers. \n
              SecretName is not required when you would like to use only the
              HTTP endpoints (e.g. when terminating TLS at an Ingress). It
              is also not required when you would like all requests to this
              OIDC Provider's HTTPS endpoints to use the default TLS certificate,
              which is configured elsewhere. \n When your Issuer URL's host
              is an IP address, then this field is ignored. SNI does not work
              for IP addresses."
            type: string
        type: object
    required:
    - issuer
    type: object
  status:
    description: Status of the OIDC provider.
    properties:
      lastUpdateTime:
        description: LastUpdateTime holds the time at which the Status was
          last updated. It is a pointer to get around some undesirable behavior
          with respect to the empty metav1.Time value (see https://github.com/kubernetes/kubernetes/issues/86811).
        format: date-time
        type: string
      message:
        description: Message provides human-readable details about the Status.
        type: string
      secrets:
        description: Secrets contains information about this OIDC Provider's
          secrets.
        properties:
          jwks:
            description: JWKS holds the name of the corev1.Secret in which
              this OIDC Provider's signing/verification keys are stored. If
              it is empty, then the signing/verification keys are either unknown
              or they don't exist.
            properties:
              name:
                description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
                  TODO: Add other useful fields. apiVersion, kind, uid?'
                type: string
            type: object
          stateEncryptionKey:
            description: StateEncryptionKey holds the name of the corev1.Secret
              in which this OIDC Provider's key for encrypting state parameters
              is stored.
            properties:
              name:
                description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
                  TODO: Add other useful fields. apiVersion, kind, uid?'
                type: string
            type: object
          stateSigningKey:
            description: StateSigningKey holds the name of the corev1.Secret
              in which this OIDC Provider's key for signing state parameters
              is stored.
            properties:
              name:
                description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
                  TODO: Add other useful fields. apiVersion, kind, uid?'
                type: string
            type: object
          tokenSigningKey:
            description: TokenSigningKey holds the name of the corev1.Secret
              in which this OIDC Provider's key for signing tokens is stored.
            properties:
              name:
                description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
                  TODO: Add other useful fields. apiVersion, kind, uid?'
                type: string
            type: object
        type: object
      status:
        description: Status holds an enum that describes the state of this
          OIDC Provider. Note that this Status can represent success or failure.
        enum:
        - Success
        - Duplicate
        - Invalid
        - SameIssuerHostMustUseSameSecret
        type: string
    type: object
required:
- spec
type: object
served: true
storage: true
subresources:
  status: {}
status:
  acceptedNames:
    kind: ""
    plural: ""
  conditions: []
  storedVersions: []
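For reference, a minimal FederationDomain sketch that exercises the spec fields above; the name, namespace, issuer URL, and Secret name are invented, and the API group shown assumes the default `pinniped.dev` suffix:

```yaml
apiVersion: config.supervisor.pinniped.dev/v1alpha1
kind: FederationDomain
metadata:
  name: my-federation-domain      # hypothetical name
  namespace: pinniped-supervisor  # assumed Supervisor namespace
spec:
  issuer: https://issuer.example.com/any/path  # hypothetical issuer; also the base URL for its endpoints
  tls:
    secretName: my-federation-domain-tls       # hypothetical kubernetes.io/tls Secret for the serving cert
```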