@jzachr - ROAST.dev

Personality

Architecturally-minded Forward-thinking about maintenance Collaborative and consensus-seeking Detail-oriented about naming and patterns Pragmatic about trade-offs Security-conscious Process-oriented Patient explainer

Greatest Hits

"I would consider adding"

"This seems like a really bad pattern"

"We really should make this configurable"

"This is fine for now"

"I would not have anything in the client that mentions"

"This creates really bad coupling between systems"

"Will be addressed in a future PR"

Focus Areas

System architecture and coupling
Configuration and maintainability
Naming conventions and clarity
Security implications
Feature flag patterns
Future-proofing code
API design
Code organization

Common Phrases

"I think it would be" "I would consider" "This looks good" "I don't think we" "This seems like" "I would not have" "Would it be worth" "This is fine, but" "I'm approving but" "This should still" "We really should make" "I would change naming" "This is not applicable" "Will be addressed in a future PR" "Follow up"

Sentiment Breakdown

questioning

12

neutral

144

constructive

16

positive

10

harsh_questioning

1

very_positive

1

Review Outcomes

APPROVED

166

CHANGES_REQUESTED

17

COMMENTED

1

DISMISSED

2

Most Reviewed Authors

jzachr

64

KateZhang98

51

frankfeng98

38

colriot

29

jayfish0

22

SuveenE-TF

22

ayc1

21

gauravreddy08

15

lozzle

14

gabgoesfish

13

Spiciest Comments

friday/#300 · pyt/app/evaluation/evaluator/base.py [view]

We should use the new syntax available with Python 3.12+ https://typing.readthedocs.io/en/latest/spec/generics.html#user-defined-generic-types This would be: ```python class EvaluationResult[T, U](BaseModel): success: bool output: T expected: U data: dict | None = None class Evaluator[T, U](ABC): @abstractmethod def extract_result(self, runner_result: RunnerResult) -> T: """Extract relevant data from browser for evaluation""" pass

AI Persona Prompt

You are @jzachr, a senior engineer who reviews code with a strong focus on system architecture and long-term maintainability. You think through problems systematically and often provide detailed explanations of your reasoning. Your reviews tend to be thorough and diplomatic, frequently starting with 'I think it would be' or 'I would consider'. You're particularly concerned about coupling between systems, proper configuration management, and establishing good patterns that won't create technical debt later. You often suggest alternative approaches and explain why they might be better. You're collaborative in your approach, frequently referencing other team members' opinions and building on their feedback. When you see architectural issues, you explain them in detail rather than just pointing them out. You're pragmatic about trade-offs and will approve things that aren't perfect if they're good enough for now, often saying 'This is fine for now' or 'This looks good, but I would consider'. You care deeply about naming conventions and API design. You frequently think about security implications and deployment considerations. You have a habit of acknowledging when you'll address issues in follow-up PRs rather than blocking current work. Your tone is professional and constructive, even when pointing out significant problems. You provide specific, actionable feedback and often include implementation suggestions or links to relevant documentation.

Recent Comments (184 total)

friday/#1018 add multishot eval to goal planner evaluator [view]

Why are we _replacing_ the current `GoalPlannerEvaluator`? I don't see why we wouldn't keep and use both of them. Let's make this `GoalPlannerMultishotEvaluator` so that we can keep both. It is very common to have more than one evaluator for a problem case. I think this is one of those times.

friday/#1018 add multishot eval to goal planner evaluator · pyt/evaluation/evaluator/components/prompts/goal_planner.py [view]

This should still have high level evaluation criteria similar to the guidelines that we give the annotators.

friday/#1233 [cory] refactor api_discovery step · pyt/cory/utils/storage.py [view]

This is fine, but it isn't really a common for handling different "protocols" in fsspec land. Generally the way this should be done is by including a "working directory" or similar which is either "file://", "/....", or "s3://", "gcs://" and then having fsspec figure out which registered filesystem to use. In the `fs_ls` part, _prepend_root would simply prepend whatever the root directory is.

friday/#1193 TF-8681 Update Pathfinder eval e2e runner to use Unikraft + proxy [view]

We really should make this configurable -- but this is fine for now. You might also want to take a look at increasing the parallelization, if we are not running the browsers locally.

friday/#1171 Return false if ignore action [view]

Is this definitely the behavior that we want? What examples did we retest with? I feel like this would break agentql, since it depends on evaluate calls to add `tfid` properties to the elements which are used in locators later?

friday/#1161 Add weblog session id logging · pyt/friday/main.py [view]

Doesn't this need a null check on weblog_config?

friday/#1160 TF-8403 Add RequestMatcher that also matches on method · pyt/friday/services/weblog/request_matcher.py [view]

The number of matches is very small. It would be more expensive to build and use a hashmap in this situation.

friday/#1160 TF-8403 Add RequestMatcher that also matches on method · pyt/friday/services/weblog/request_matcher.py [view]

This is not applicable since we are doing a best effort match. We need to fall back to something.

friday/#1160 TF-8403 Add RequestMatcher that also matches on method · pyt/friday/services/weblog/request_matcher.py [view]

Fixed