Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add telemetry for service entries #6047

Draft
wants to merge 8 commits into
base: dev
Choose a base branch
from

Conversation

Nigusu-Allehu
Copy link
Contributor

@Nigusu-Allehu Nigusu-Allehu commented Sep 23, 2024

Bug

Fixes: NuGet/Home#13744

Description

This PR adds telemetry for service index resources that are HTTP, but their source is HTTPS.
for example a v3 source with the following index would end up using a non https service endpoint.

{
    "version": "3.0.0",
    "resources": [
        {
            "@id": "http://source",
            "@type": "SearchQueryService",
            "comment": "Query endpoint of NuGet Search service (primary)"
        }]
}

The data logged is as follows

  • Source uri - hashed
  • Is source HTTP
  • Is source HTTPS
  • Is resource uri HTTP?
  • is resource uri HTTPS?
  • What is the resource type

This information is logged at the creation of service index entries. These entries are created when a client actually needs to make contact with the source.

PR Checklist

  • Meaningful title, helpful description and a linked NuGet/Home issue
  • Added tests
  • Link to an issue or pull request to update docs if this PR changes settings, environment variables, new feature, etc.

@Nigusu-Allehu Nigusu-Allehu requested a review from a team as a code owner September 23, 2024 20:29
@Nigusu-Allehu Nigusu-Allehu self-assigned this Sep 23, 2024
@Nigusu-Allehu Nigusu-Allehu marked this pull request as draft September 23, 2024 20:29
@Nigusu-Allehu Nigusu-Allehu marked this pull request as ready for review September 25, 2024 00:02
}

var telemetry = new ServiceIndexEntryTelemetry(source, _packageSource.IsHttp, _packageSource.IsHttps, isResourceHTTP, isResourceHTTPs, type, "ServiceIndexEntrySummary");
TelemetryActivity.EmitTelemetryEvent(telemetry);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly, from https://github.com/NuGet/NuGet.Client/blob/dev/docs/nuget-sdk.md#changes-to-apis

I don't feel like this is a good reason to introduce a new public API.

Secondly, I have a personal preference that only one constructor should have an empty body. So, instead of adding a new constructor that invokes the old instructor, and then does extra work, change the old constructor to call the new constructor, and move all the code from the old constructor to the new one. This way there's only 1 constructor with code, and there's less chance of bugs when someone makes a change and breaks one of the constructors.

Finally, instances of this class are going to be created all at the same time, so it will flood VS with events all at once. NuGet.org's service index has 39 entries, by my count.

We already have a lot of telemetry being throttled and therefore lost. I imagine that sending 39 entries, just for nuget.org (plus tens more for each other package source) will make telemetry throttling much worse. It's better to emit a single event, which contain a whole lot of properties (or one "complex property" with a lot of data) from ServiceIndexResourceV3, or ServiceIndexV3Provider. That will still emit one event per package source, so honestly, I'd prefer telemetry at an even higher level where there can be a single event with a big payload.

Look at dotnet/sdk for example, it has 24 sources configured. and looking at the first service index, it has 11 entries. So, that's probably 264 events that this PR will currently try to emit.

Honestly though, I'd prefer it if this telemetry was either in RestoreCommand, or somewhere in src\NuGet.Clients\*, not in NuGet.Protocol. In order to make that work, the telemetry code will need to sourceRepository.GetResourceAsync<ServiceIndexV3Resource>() and then iterate the entries. But this has a problem when the service index hasn't already been fetched. If it hasn't already been fetched, then just by requesting the resource, it will fetch it, and we don't want to make unnecessary HTTP requests. Especially if the server needs authentication, and valid credentials aren't provided (and can't be obtained from a cred provider), then NuGet will cause a UI popup asking for credentials. We definately don't want to create a modal dialog for credentials just to gather some telemetry.

Perhaps we could introduce a LazyServiceIndexV3Resource, which, similar to the Lazy<T> type, has some kind of HasValue property, so the telemetry can get this resource, without causing a download, and then change ServiceIndexV3Resource to get the lazy resource and immediately get its value. If ServiceIndexV3Resource didn't have Entries and Json properties, we could have just modified it to add a HasEntries property. It's best to talk to a few people about the idea first, rather than spending a lot of time implementing something and then having different people give you conflicting feedback.

@Nigusu-Allehu Nigusu-Allehu marked this pull request as draft September 27, 2024 20:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Collect telemetry for HTTPS sources with HTTP resources
2 participants