Add Structured Audit Logging by mrzzy · Pull Request #891 · feast-dev/feast

mrzzy · 2020-07-18T09:43:28Z

Why we need this PR
Feast behavior and usage can be opaque due to a lack of consistent logging of:

request/response messages handled by Feast Core/Serving.
The user identity that was adapted.

Logging in Feast is also inconsistent, making it difficult for third-party logging systems to parse.

What this PR does:
Adds Audit Logging to Feast Core and Feast Serving:

Added AuditLogger that exposes structured logging methods logMessage(), logAction() etc.
- Made AuditLogger configurable (including the option to disable it) from Core/Serving's application.yml
- Log entries produced by AuditLogger are structured JSON and machine parsable.
Added AuditLogEntry and subclasses that define the structure of each log entry and provide JSON conversion.
Added GrpcMessageInterceptor to intercept incoming (request) and outgoing (response) messages and log them to the Audit Log for both Core and Serving Services.
- full request/response available in JSON audit log.
- support for displaying authenticated identity (ie authenticated user).
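
As a rough illustration of the AuditLogEntry idea described above (a base type that fixes the shared structure, with subclasses adding their own fields and a JSON conversion), here is a minimal, dependency-free sketch. The class names, field names and the hand-rolled JSON rendering are assumptions made for illustration and are not the code from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical base class: every audit log entry shares a kind and a component name.
abstract class SketchAuditLogEntry {
  private final String kind;
  private final String component;

  protected SketchAuditLogEntry(String kind, String component) {
    this.kind = kind;
    this.component = component;
  }

  // Subclasses contribute their own fields on top of the shared ones.
  protected abstract void addFields(Map<String, String> fields);

  // Renders the entry as a flat, single-line JSON object of string values.
  public final String toJson() {
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("kind", kind);
    fields.put("component", component);
    addFields(fields);
    return fields.entrySet().stream()
        .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
        .collect(Collectors.joining(",", "{", "}"));
  }

  // Minimal escaping: handles backslashes and double quotes only.
  private static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }
}

// Hypothetical entry describing a single gRPC request or response message.
final class SketchMessageAuditLogEntry extends SketchAuditLogEntry {
  private final String method;
  private final String identity;

  SketchMessageAuditLogEntry(String component, String method, String identity) {
    super("message", component);
    this.method = method;
    this.identity = identity;
  }

  @Override
  protected void addFields(Map<String, String> fields) {
    fields.put("method", method);
    fields.put("identity", identity);
  }
}
```

Because the base class owns `toJson()`, every entry type produces the same outer structure, which is the property that makes the log machine parsable. A real implementation would more likely delegate serialization to a JSON library than escape strings by hand.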

Refactor/Tech debt cleanups:

Make JobService perform actions on jobs via JobTask instead of directly with JobManager to be consistent with JobCoordinatorService

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

Add Structured Audit Logging
- Audit Log Entries are produced in structured JSON format.
- Audit Logger can be configured from application.yml:
     - feast.logging.audit.enabled - enables and disables audit logging
     - feast.logging.audit.messageEnabled - enables and disables request/response message audit logging.
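
For orientation, the two properties listed above might appear in application.yml roughly as follows. The nesting is inferred from the dotted property names and the values are arbitrary; the final key names may differ, since a rename of the message flag is discussed later in the review.

```yaml
feast:
  logging:
    audit:
      # Enables or disables audit logging as a whole.
      enabled: true
      # Enables or disables request/response message audit logging.
      messageEnabled: false
```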

mrzzy · 2020-07-18T09:47:14Z

common/src/main/java/feast/common/interceptors/GrpcExceptionInterceptor.java

@woop Do the current E2E auth tests verify that there are no false positives for authentication? (i.e. that authentication fails when the wrong credentials are supplied.)

#892 These tests will fail your authentication if you try to sign in with a JWT that cannot be verified.

No longer relevant as this was deemed out of scope for this PR.

mrzzy · 2020-07-19T02:41:54Z

/retest

woop · 2020-07-19T06:35:51Z

common/src/main/java/feast/common/logging/AuditLogger.java

Should Log4j2 be hardcoded here? #277

Used SLF4J instead.

woop · 2020-07-19T06:42:24Z

common/src/main/java/feast/common/logging/AuditLogger.java

Shouldn't the format be a part of the logger configuration xml instead of a part of the code base?

The idea was to expose the ability to configure the logger for the most common use cases via application.yml, without interacting with the logging configuration XML, as it is inaccessible from our Helm and Docker-Compose setup.

As mentioned in the response above, I think we can use the application.yml for configuring logging without writing our own logformat configuration.

woop · 2020-07-19T06:56:31Z

common/src/main/java/feast/common/config/LoggingProperties.java

I can't make out this comment. What does the direction represent?

Direction represents the flow of messages relative to the service:

incoming - matches request messages

outgoing - matches response messages.

Users can use this to configure for which request "direction" message audit logging is enabled. (ie leave in only incoming to enable request logging)

Will change this to request and response as that makes it easier to understand.
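
The direction setting described above could be modelled as a small filter, sketched here using the request/response naming. The enum and class names are hypothetical and only illustrate the idea of enabling message logging per direction.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names: which side of a call a message belongs to.
enum MessageDirection {
  REQUEST,
  RESPONSE
}

// Hypothetical filter deciding whether a message should be audit logged.
final class MessageLoggingFilter {
  private final Set<MessageDirection> enabled;

  MessageLoggingFilter(Set<MessageDirection> enabled) {
    // EnumSet.copyOf rejects an empty non-EnumSet collection, so handle that case explicitly.
    this.enabled =
        enabled.isEmpty()
            ? EnumSet.noneOf(MessageDirection.class)
            : EnumSet.copyOf(enabled);
  }

  // True when messages flowing in the given direction should be logged.
  boolean shouldLog(MessageDirection direction) {
    return enabled.contains(direction);
  }
}
```

Configuring the filter with only `REQUEST` would correspond to the "leave in only incoming to enable request logging" case mentioned above.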

woop · 2020-07-19T06:58:01Z

common/src/main/java/feast/common/interceptors/GrpcExceptionInterceptor.java

Should this be in a separate PR?

Yeah, will do.

woop · 2020-07-19T06:58:47Z

common/src/main/java/feast/common/interceptors/GrpcMessageInterceptor.java

Does this work for you? I have been pulling the identity from claims.

woop · 2020-07-19T07:25:26Z

common/src/main/java/feast/common/interceptors/GrpcMessageInterceptor.java

Is this supposed to apply to GetOnlineFeatures as well, because in that case we will probably experience a serious performance hit. Audit logging is normally distinct from request/response logging, since with the former you expect durability and consistency but with the latter you want availability and performance. Logging the complete request/response is still useful as part of the audit log, but we need to have a way to deal with the latency sensitive case.

The AuditLogger can be configured from application.yml via AuditLogProperties, and can therefore be disabled for the Online Serving instance in cases where latency is more important than visibility.

Yep, but we need request/response logging in online serving as well. We just want to make sure that this logging happens in async fashion and doesn't affect latency of the method
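
One way to read the "async fashion" requirement above is to hand each log line to a background writer so the request thread never waits on the log sink. The following is a minimal sketch of that idea using only the JDK; the class name and the single-thread executor are assumptions for illustration, not what this PR implements.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper: queues log lines for a background thread to write.
final class AsyncAuditSink implements AutoCloseable {
  // A single writer thread keeps entries in submission order.
  private final ExecutorService executor =
      Executors.newSingleThreadExecutor(
          r -> {
            Thread t = new Thread(r, "audit-log-writer");
            t.setDaemon(true);
            return t;
          });
  private final Consumer<String> sink;

  AsyncAuditSink(Consumer<String> sink) {
    this.sink = sink;
  }

  // Returns immediately; the sink is invoked later on the writer thread.
  // Calling this after close() throws RejectedExecutionException.
  void log(String line) {
    executor.execute(() -> sink.accept(line));
  }

  // Stops accepting new lines and waits up to five seconds for queued ones.
  @Override
  public void close() {
    executor.shutdown();
    try {
      executor.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

This trades durability for latency: queued entries can be lost if the process dies before they are written, which is the tension between audit logging and request/response logging raised earlier in this thread. The queue here is also unbounded, so a production version would need a policy for backpressure.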

woop · 2020-07-19T08:00:23Z

common/src/main/java/feast/common/logging/LogEvent.java

My initial reaction when seeing this code is that it seems quite overwhelming. A lot of decisions have been made here in terms of the structure of our audit logging and I am not sure if we are being too ambitious here. I am quite worried that these abstractions will end up changing quite soon once people start trying to use the audit logger. I was hoping we could start with a simpler key/value API.

Did you follow some kind of convention when structuring the audit logger and its associated types, or was everything custom made?

Did you follow some kind of convention when structuring the audit logger and its associated types, or was everything custom made?

No, Yes.

I was hoping we could start with a simpler key/value API.

Reasons that we should go with a Key Value API.

Infinitely more flexible and extensible as no constraints are applied.

Reasons that we should not go with a Key Value API:

Java is statically typed. Going with a generic Key Value would mean Map<String, Object> which implies lots of ugly casting.

With no enforced structure in place it will be difficult to parse the logs in upstream systems as there is no expected structure.

Enforces some rigidity in logging so that we don't break upstream systems unknowingly built upon audit logging.

The structured log will be depended on by upstream logging systems and should be treated as a user facing API that should be clearly defined, the same way Core and Serving Service API are clearly defined.

I am quite worried that these abstractions will end up changing quite soon once people start trying to use the audit logger

The important thing is that we are fully aware of changes instead of being blindsided by a map.put()
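
To make the static typing argument above concrete, here is a small sketch contrasting a generic key/value entry with a typed one. All names are hypothetical and chosen only for the comparison.

```java
import java.util.Map;

final class KeyValueVersusTyped {
  private KeyValueVersusTyped() {}

  // Generic key/value entry: flexible, but readers must know the key and cast the value.
  static String identityFromMap(Map<String, Object> entry) {
    // Throws ClassCastException at runtime if someone stored a non-String under this key.
    return (String) entry.get("identity");
  }

  // Typed entry: the compiler enforces which fields exist and what type they have.
  static final class TypedEntry {
    final String identity;
    final int statusCode;

    TypedEntry(String identity, int statusCode) {
      this.identity = identity;
      this.statusCode = statusCode;
    }
  }

  // No cast needed, and renaming or retyping the field fails at compile time.
  static String identityFromTyped(TypedEntry entry) {
    return entry.identity;
  }
}
```

With the map, a stray `put("identity", 42)` only surfaces when a consumer reads the entry; with the typed class, the equivalent change is a compile error at the point of the change.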

My suggestion is not to throw away all static typing and just accept arbitrary string/object keys and values. Parts of what you have implemented make sense, but other parts do not.

From a functional perspective, what we want is for specific parts of the Feast code base to create a message that can be serialized and printed/sent to the audit log. So having static methods like ofMessage(MessageLogEvent.Kind kind, String method, Message message, String identity) makes sense.

However, I question the value of this LogEvent as a container. It seems like a catchall class that provides little in terms of a contract. When would you have both a MessageLogEvent and an ActionLogEvent in the same object? It doesn't seem like that should be possible, so it's not clear what the scope of this LogEvent class should be. Will it just grow infinitely as a catchall class that can contain any subclass?

If we agree that maintaining state is a bad thing, then do we need to maintain LogEvents as an object? What is the SLF4J contract that we need to meet? Does it just expect strings?

What prevents us from having an AuditLogger class with static methods like ofMessage that converts a tight contract of objects into a Map or string that can be serialized?

Refactored away LogEvent into MessageAuditLogEntry, ActionAuditLogEntry, TransitionAuditLogEntry subclasses.

woop · 2020-07-19T08:03:53Z

common/src/test/java/feast/common/logging/AuditLogEntryTest.java

What is this testing? Wouldn't toString() always return something?

Yeah, I guess I was just trying to exercise the code path. Would remove.

woop · 2020-07-19T08:08:54Z

core/src/main/java/feast/core/grpc/CoreServiceImpl.java

This code looks much better with the exception handling in the interceptor, but I don't think it's in scope for this PR, is it?

Yeah, moving out of this PR.

woop · 2020-07-19T08:14:30Z

core/src/main/java/feast/core/job/task/CreateJobTask.java

Why did we add this constructor?

Originally this used the @Builder constructor from Lombok and had fields duplicated across all 4 Job Task implementations. I moved the fields up to the JobTask abstract class; however, the @Builder annotation has problems with inherited fields, hence the added constructor.
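
The shape of that change, shared fields moved into the abstract parent and an explicit constructor in each subclass forwarding them, can be sketched as follows. The classes and fields are simplified, hypothetical stand-ins and do not reproduce the real JobTask or CreateJobTask.

```java
// Hypothetical, simplified stand-in for an abstract task holding the shared fields.
abstract class SketchJobTask {
  protected final String jobId;
  protected final String runner;

  protected SketchJobTask(String jobId, String runner) {
    this.jobId = jobId;
    this.runner = runner;
  }

  abstract String describe();
}

// Hypothetical subclass: no duplicated fields, only behaviour.
final class SketchCreateJobTask extends SketchJobTask {
  // Explicit constructor forwarding the shared fields to the abstract parent.
  SketchCreateJobTask(String jobId, String runner) {
    super(jobId, runner);
  }

  @Override
  String describe() {
    return "create " + jobId + " on " + runner;
  }
}
```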

woop · 2020-07-19T08:15:43Z

core/src/main/java/feast/core/job/task/JobTask.java

It bothers me a bit that we are pulling in logging dependencies directly into our code. Shouldn't this be abstracted away?

Replaced with SLF4J.

woop · 2020-07-19T08:55:04Z

common/src/main/java/feast/common/interceptors/GrpcMessageInterceptor.java

Is it possible for us to keep things simpler by using SLF4J for logging instead of having our own AuditLogger? Example: https://github.com/yidongnan/grpc-spring-boot-starter/blob/master/examples/local-grpc-server/src/main/java/net/devh/boot/grpc/examples/local/server/LogGrpcInterceptor.java#L34

I can see some value in having a way to standardize the message format through static methods, especially if we need to have actions/resources/identities, but having to pass around an audit logger when SLF4J is already available seems like an unnecessary overhead to me.

Is it possible for us to keep things simpler by using SLF4J for logging instead of having our own AuditLogger? Example: https://github.com/yidongnan/grpc-spring-boot-starter/blob/master/examples/local-grpc-server/src/main/java/net/devh/boot/grpc/examples/local/server/LogGrpcInterceptor.java#L34

Why we have our own Audit Logger:

Allow us to configure the audit logger from application.yml which is valuable as:

Allows us to configure the logger via application.yml (ie disable request logging for Online Serving without hard coding something.)

We can inject Feast specific elements into the Audit Log (ie Feast Component, Feast Version, or in the future git hash.)

Allows us to configure the logger via application.yml (ie disable request logging for Online Serving without hard coding something.)

What have we implemented that can't be implemented using the normal logging configuration https://howtodoinjava.com/spring-boot2/logging/configure-logging-application-yml/

Also, it seems like your answer focuses on configuration but the AuditLogger is more than just a configuration holder right? We can have custom configuration in the FeastProperties for enabling/disabling request/response logging if needed, but that is a separate question to whether we need to maintain the AuditLogger and pass it around.

We can inject Feast specific elements into the Audit Log

Since we have an application context just floating around inside of Feast, we can easily grab these settings at any time without needing to pass around an AuditLogger, right? I am just trying to make sure we aren't reinventing something that already exists.

Updated AuditLogger's logging methods to static to remove the need of passing around the AuditLogger instance.
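
A static facade of the kind described above might look like the sketch below. It uses `java.util.logging` only so that the example has no external dependencies (the PR itself moved to SLF4J), and the class name, the `configure` method and the `logAction` signature are assumptions for illustration.

```java
import java.util.logging.Logger;

// Hypothetical facade: static methods, so callers never hold or pass an instance.
final class SketchAuditLogger {
  private static final Logger LOG = Logger.getLogger("audit");
  private static volatile boolean enabled = true;

  private SketchAuditLogger() {}

  // Would be called once at startup with the value read from application configuration.
  static void configure(boolean auditEnabled) {
    enabled = auditEnabled;
  }

  // Formats and emits an action entry. Returns the emitted line, or null when disabled.
  // Values are not escaped here, so this is only safe for simple illustrative strings.
  static String logAction(String action, String resource, String identity) {
    if (!enabled) {
      return null;
    }
    String line =
        String.format(
            "{\"action\":\"%s\",\"resource\":\"%s\",\"identity\":\"%s\"}",
            action, resource, identity);
    LOG.info(line);
    return line;
  }
}
```

A call site then reduces to `SketchAuditLogger.logAction("CREATE", "job-1", "alice")`, with no logger instance injected or passed through constructors. The cost of this design is global mutable state, which makes the facade harder to isolate in tests.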

woop · 2020-07-19T08:59:34Z

serving/pom.xml

Do we really need to set this? Shouldn't this be set in the parent-pom?

This was set as the interceptors do not work properly under the lognet grpc starter.

Removed version part to be set in the parent POM.

woop · 2020-07-19T09:13:33Z

I think it is unlikely that we will get this PR merged in any time soon in its current state. It's trying to do too much.

New AuditLogger/LogEvents with custom messages
Global exception handler
Request/response logging
Job transition logging
Replacement of exception logging

The MVP we need for audit logging is not much different from what we have in our existing AuditLogger. We need a way to log incoming requests to track user actions. Tracking internal state changes of jobs is not necessary yet, and even in that case I think it can be tracked at the entity level.

I would recommend that you try and separate these into different PRs. The first and most important PR implements a logging interceptor that logs authenticated user requests and the incoming payload using a standardized audit log format (hopefully with SLF4J). This might serve as inspiration https://cloud.google.com/logging/docs/reference/audit/auditlog/rest/Shared.Types/AuditLog

mrzzy · 2020-07-20T03:09:13Z

Job transition logging.

This already exists with the current AuditLogger

Next steps:

Move exception logging/Global exception handler to a separate PR.
Make logging code dependent on SLF4J instead of Log4j2 (Standardize logging libraries for Java components #277).
Make AuditLogger's logging methods static to remove the need of passing it around in code.
Refactor away LogEvent into MessageAuditLogEntry, ActionAuditLogEntry, TransitionAuditLogEntry subclasses.
Refactor to make sure that AuditLogger is a facade to audit logging functionality.

core/src/main/java/feast/core/config/CoreLoggingConfig.java

woop · 2020-07-20T10:10:26Z

core/src/main/java/feast/core/config/FeastProperties.java

Why was this changed?

mrzzy · 2020-07-27T03:26:54Z

/test test-end-to-end-batch

mrzzy · 2020-07-27T05:54:49Z

/test test-end-to-end-batch

mrzzy · 2020-07-27T05:55:00Z

/test test-end-to-end-batch-dataflow

woop · 2020-07-27T09:43:07Z

common/src/main/java/feast/common/logging/config/LoggingProperties.java

Can we call this something more descriptive? enableMessageLogging?

Ok. updated.

mrzzy · 2020-07-28T04:29:04Z

/test test-end-to-end

mrzzy · 2020-07-28T04:29:11Z

/test test-end-to-end-redis-cluster

mrzzy · 2020-07-28T04:29:17Z

/test test-end-to-end-batch-dataflow

mrzzy · 2020-07-28T07:39:08Z

/test test-end-to-end-batch-dataflow

woop · 2020-07-28T08:51:25Z

common/src/main/java/feast/common/logging/AuditLogger.java

Still not sure why we need to persist with FeastInstance.

mrzzy · 2020-07-30T04:04:26Z

/test test-end-to-end-batch-dataflow

woop · 2020-07-30T04:09:23Z

/lgtm

feast-ci-bot · 2020-07-30T04:09:38Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mrzzy, woop

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [woop]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mrzzy · 2020-07-30T04:58:45Z

/test test-end-to-end-batch-dataflow

mrzzy · 2020-07-30T07:07:08Z

/test test-end-to-end-batch-dataflow

feast-ci-bot · 2020-07-30T07:37:06Z

New changes are detected. LGTM label has been removed.

mrzzy requested review from davidheryanto, khorshuheng, pyalex, woop and zhilingc as code owners July 18, 2020 09:43

feast-ci-bot added do-not-merge/work-in-progress needs-kind size/XXL labels Jul 18, 2020

mrzzy added kind/feature New feature or request kind/techdebt labels Jul 18, 2020

feast-ci-bot removed the needs-kind label Jul 18, 2020

mrzzy commented Jul 18, 2020

woop reviewed Jul 19, 2020

woop reviewed Jul 20, 2020

woop reviewed Jul 20, 2020

woop reviewed Jul 27, 2020

mrzzy closed this Jul 28, 2020

mrzzy reopened this Jul 28, 2020

woop reviewed Jul 28, 2020

feast-ci-bot added size/XL size/XXL and removed size/XXL size/XL labels Jul 30, 2020

feast-ci-bot assigned woop Jul 30, 2020

feast-ci-bot added the lgtm label Jul 30, 2020

woop approved these changes Jul 30, 2020

feast-ci-bot added the approved label Jul 30, 2020

Squash audit logging PR and rebase on master

274c99d

feast-ci-bot removed the lgtm label Jul 30, 2020

woop merged commit 8acab49 into feast-dev:master Jul 30, 2020

mrzzy mentioned this pull request Aug 2, 2020

Backports for v0.6.2 #918

Merged

pyalex pushed a commit that referenced this pull request Aug 2, 2020

Add Structured Audit Logging (#891)

a466f64
