mingyang-tinyfish

Mingyang

@mingyang-tinyfish
GitHub Profile
collaborative and thoughtful
Mingyang is a thorough, collaborative reviewer who focuses on architectural decisions and system design. He provides detailed technical explanations, asks thoughtful questions about design choices, and often suggests alternative approaches while maintaining a constructive tone.
Comments: 654
PRs: 211
Repos: 9
Avg Chars: 215
Harshness: 3

Personality

Architecturally-minded
Collaborative and team-oriented
Detail-oriented with explanations
Questioning but constructive
Future-focused on maintainability
Appreciative of good work
Pragmatic about trade-offs
Proactive in creating tickets and follow-ups

Greatest Hits

"LGTM overall!"
"Thanks for the quick review!"
"Good point! Updated the code"
"I think we can move this to"
"Would it be better for"
"Created a ticket for it"
"Feel free to test it out!"

Focus Areas

Common Phrases

"I think"
"Thanks for"
"Good point!"
"LGTM overall!"
"Let me know if"
"Would it be better"
"This will be part of"
"Feel free to"
"I am afraid this will lead to"
"As part of the"
"Right now"
"For now"
"We might wanna"
"Created a ticket for it"
"Thanks for the quick review!"

Sentiment Breakdown

neutral: 364
questioning: 50
constructive: 75
positive: 59
harsh_questioning: 14
critical: 4
very_positive: 6

Review Outcomes

APPROVED: 93
COMMENTED: 2
CHANGES_REQUESTED: 33

Most Reviewed Authors

mingyang-tinyfish: 259
jinyangTF: 157
zifanwTF: 61
wjwjtf: 56
jayfish0: 29
bellatinyfish: 21
ayc1: 20
paveldudka: 18
thakkerurvish: 10
shuhaodo: 6

Spiciest Comments

goldfish/#106 · app/response/merger/merger.py
Clarification:
- Those new types are for experimenting and testing purposes. The strategy config is only for internal use and is not configurable by users.
- For production, it should always refer to the default strategy defined here: https://github.com/tinyfish-io/goldfish/blob/309aed6433920f39594f88a9d1c455b4e64fc549/app/config.py#L47
- We will update the default strategy over time when we find a better config set after systematic evaluation with our WEB data. The normalization + randomn
goldfish/#158 · app/llm/model.py
A temporary max-token and max-output-tokens. Feel free to update it when we have a solid number for these configs. @zifanwTF
goldfish/#158 · app/llm/aws_sagemaker.py
@zifanwTF I was referring to the sample code here: https://huggingface.co/codellama/CodeLlama-7b-hf

```
from transformers import AutoTokenizer
import transformers
import torch

model = "codellama/CodeLlama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)
sequences = pipeline(
    'import socket\n\ndef ping_exponential_backoff(host: str):',
```
goldfish/#160 · tests/app/llm/model_family_test.py
The encoded values by Mistral and Codellama don't have spaces in each token. @zifanwTF Is this expected?
goldfish/#160
@zifanwTF I switched Tokenizer to AutoTokenizer. Would need your help to look into the following issues: 1. The claude tokenizer as suggested in https://huggingface.co/Xenova/claude-tokenizer cannot be found by AutoTokenizer. `Error msg: ValueError: Tokenizer class ClaudeTokenizer does not exist or is not currently imported.` 2. The encoded values are different from my previous Tokenizer. Could you plz check if those encoded values make sense? Some seem to be weird. You can find the
goldfish/#160
> > @zifanwTF I switched Tokenizer to AutoTokenizer. > > Would need your help to look into the following issues: > > > > 1. The claude tokenizer as suggested in https://huggingface.co/Xenova/claude-tokenizer cannot be found by AutoTokenizer. > > `Error msg: ValueError: Tokenizer class ClaudeTokenizer does not exist or is not currently imported. ` > > 2. The encoded values are different from my previous Tokenizer. Could you plz check if those encoded values make sense? Some seems to be
goldfish/#48 · app/llm/open_ai.py
Looks like the `max_tokens` for OpenAI is referring to the total context window, so 4096 may not be applicable to every model. Some other models treat it as the max response tokens, so we would need to review this config individually. @zifanwTF
goldfish/#48 · app/llm/model.py
Here we centralize the default configs for all supported models. I'll add a more explicit description of max-tokens once we have a clear understanding of how each model uses it. @zifanwTF
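The per-model ambiguity flagged in the two comments above (context window vs. max response tokens) is exactly what a centralized config can make explicit. A minimal sketch, assuming illustrative names and made-up numbers rather than goldfish's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelConfig:
    """Default config for one model. Field names are illustrative."""
    context_window: int     # total tokens (prompt + completion)
    max_output_tokens: int  # cap on the completion alone

# Hypothetical defaults: the point is that "max_tokens" means different
# things per provider, so we record both numbers explicitly per model.
DEFAULT_CONFIGS = {
    "gpt-3.5-turbo": ModelConfig(context_window=4096, max_output_tokens=1024),
    "codellama-7b": ModelConfig(context_window=16384, max_output_tokens=2048),
}

def output_budget(model: str, prompt_tokens: int) -> int:
    """Largest completion we can request without overflowing the context window."""
    cfg = DEFAULT_CONFIGS[model]
    return min(cfg.max_output_tokens, cfg.context_window - prompt_tokens)
```

Keeping both numbers in one table avoids re-deriving the semantics of each provider's `max_tokens` at every call site.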
goldfish/#133 · app/response/prompts/gpt3_5/generation/baseline.py
The example messages are appended to this message, so in that case the type hint here is not the end of the user prompt. https://github.com/tinyfish-io/goldfish/blob/9723479c1207ea3357537dde869b260e49881725/app/llm/open_ai.py#L66 @zifanwTF do we need to rearrange it to make sure this line is at the end of the simulated conversation when example messages are given?
goldfish/#191
> Sorry for chiming in, but this PR looks similar to the Pydantic Model I built. If using the Pydantic Model, with the following response > > `AgentQLResponse(search_btn=None, search_box=None, parent=Parent(child1=109, child2=113), links=[], capcha=[Capcha(name=None, price=None, reviews=[])])` > > You can just do `agentlql_response.model_dump_json()` to get a json string (you can load it to a dict ofc). > > `{"search_btn":null,"search_box":null,"parent":{"child1":109,"child2":113},"link
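The round-trip described in the quoted response can be sketched with stdlib dataclasses. Pydantic's `model_dump_json()` does this in one call; the dataclass analogue below is illustrative, with field names taken from the quoted `AgentQLResponse` example and everything else assumed:

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional

@dataclass
class Parent:
    child1: Optional[int] = None
    child2: Optional[int] = None

@dataclass
class AgentQLResponse:
    # Subset of the fields from the quoted example; types are assumptions.
    search_btn: Optional[str] = None
    search_box: Optional[str] = None
    parent: Optional[Parent] = None
    links: List[str] = field(default_factory=list)

resp = AgentQLResponse(parent=Parent(child1=109, child2=113))
# asdict() recurses into nested dataclasses, so json.dumps() gives the
# same kind of JSON string that model_dump_json() would produce.
payload = json.dumps(asdict(resp))
```

With pydantic, the dataclasses become `BaseModel` subclasses and `payload = resp.model_dump_json()` replaces the last two lines.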

AI Persona Prompt

You are mingyang-tinyfish, a collaborative and architecturally focused code reviewer. Your reviews are thorough and thoughtful, often diving deep into design decisions and system architecture. You frequently ask clarifying questions about design choices and suggest alternative approaches when you see potential improvements.

Key aspects of your review style:
- Always explain the 'why' behind your suggestions with detailed technical context
- Use phrases like 'I think', 'Would it be better', 'Thanks for', and 'Good point!' regularly
- Often mention future implications and maintainability concerns
- Create tickets for follow-up work and reference them in reviews
- Appreciate good work with 'LGTM overall!' and 'Thanks for the quick review!'
- When you spot issues, explain them thoroughly rather than just pointing them out
- Be collaborative: ask 'Let me know if' and say 'Feel free to' to encourage discussion
- Think about the bigger picture and how changes fit into the overall system
- Be pragmatic about trade-offs and experimental features, noting 'For now' or 'Right now' when discussing temporary solutions

Focus your reviews on: system design, code organization, configuration management, performance implications, and how changes affect other parts of the system. You're not harsh, but you are thorough: you want to understand the reasoning behind decisions and ensure the code is maintainable long-term. Always be constructive and offer specific suggestions when you see areas for improvement.

Recent Comments (572 total)

agentql-client/#88 WebQL Response Calibration · test/webql/syntax/parser_test.py
The empty `IdNode` currently serves as the leaf node of a tree so as to centralize validation and calibration of the id values, letting the `ListNode` stay agnostic to TF ids and focused on list-related actions. Yeah, I think a better approach would be adding a type parameter to the `ListNode` so it can be more explicit about the type of `ListNode`, for example, `ListNode<IdNode>` or `ListNode<DictNode>`.
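The type-parameter idea from the comment above can be sketched with Python generics. `IdNode`, `ListNode`, and the validation check are illustrative stand-ins, not the actual agentql-client classes:

```python
from typing import Generic, List, TypeVar

NodeT = TypeVar("NodeT")

class IdNode:
    """Leaf node wrapping a TF id; validation is centralized here (check is a placeholder)."""
    def __init__(self, raw_id: int):
        self.raw_id = raw_id

    def validate(self) -> bool:
        return self.raw_id >= 0

class ListNode(Generic[NodeT]):
    """A list node that stays agnostic to its element type via the type parameter."""
    def __init__(self, children: List[NodeT]):
        self.children = children

# The annotation makes the element type explicit: a list of id leaves.
ids: ListNode[IdNode] = ListNode([IdNode(109), IdNode(113)])
```

A type checker can then reject mixing `IdNode` and `DictNode` children in one list, which is the explicitness the comment is after.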
agentql-client/#88 WebQL Response Calibration · src/webql/syntax/tree.py
Yes, the response validation part would only be used on the server side. However, the evaluation part is tied to each node and also to the tree structure; how do we separate that from the parser?
agentql-client/#88 WebQL Response Calibration
Thanks @paveldudka for your comments in https://github.com/tinyfish-io/goldfish/pull/54. All previous comments have been resolved in this PR.
agentql-client/#88 WebQL Response Calibration
> Should this be in webql server v.s. client?

This will be part of the client SDK for now as the parser lives here. Once all the concerns are resolved and the PR is merged, we will consider pulling the parser out as a separate library, as both the client and server sides depend on it.
agentql-client/#126 added installation and setup for inspector
LGTM overall!
agentql-client/#126 added installation and setup for inspector · docs/docs/intro.md
Are we going to replace the URL placeholder with an actual download link?
goldfish/#480 [Query Fixer for Query Gen] · app/query/fixer/model.py
is it intended to repeat this line in the system message?
goldfish/#480 [Query Fixer for Query Gen] · app/generator/query/edges.py
Why do we return to the generator when a query has been fixed? Previously the workflow was cyclic because the generator was in charge of fixing the query given the error message from the validator:
Generator -> Validator
- If fail -> Generator re-generates with the error message
- If succeed -> END
It doesn't seem to need to be cyclic if you have a separate fixer, does it?
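The two workflows being compared above can be sketched as follows. The `generate`, `validate`, and `fix` functions are stand-ins (a real implementation would call an LLM), not goldfish's actual code:

```python
from typing import Optional

def generate(error: Optional[str] = None) -> str:
    # Stand-in generator; produces a valid query only when given error context.
    return "fixed-query" if error else "broken-query"

def validate(query: str) -> Optional[str]:
    # Returns an error message, or None when the query is valid.
    return None if query == "fixed-query" else "syntax error"

def fix(query: str, error: str) -> str:
    # Stand-in for a dedicated fixer component.
    return "fixed-query"

def cyclic_workflow(max_rounds: int = 3) -> str:
    # Generator -> Validator -> (on failure) back to Generator with the error.
    query = generate()
    for _ in range(max_rounds):
        error = validate(query)
        if error is None:
            return query
        query = generate(error)
    return query

def linear_workflow() -> str:
    # With a separate fixer, there is no need to loop back to the generator:
    # Generator -> Validator -> Fixer -> END.
    query = generate()
    error = validate(query)
    return fix(query, error) if error else query
```

The linear version is the simplification the comment suggests: once a dedicated fixer exists, the validate-fail edge terminates at the fixer instead of cycling back.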
goldfish/#480 [Query Fixer for Query Gen] · app/query/fixer/model.py
I don’t have a strong opinion on this, as long as it has been tested and works well. However, I’m a bit concerned that the fixer seems to have less context about why a query is invalid or the AgentQL fundamentals compared to the generator, as seen in https://github.com/tinyfish-io/goldfish/blob/main/app/query/prompts/copilot/generation.py. If this design choice is intended to reduce noise and
goldfish/#480 [Query Fixer for Query Gen]
> thanks for working on this! unfortunately I don't think using LLM to fix a query is the right move - you can't guarantee that this response will be valid either. IMO the more I think about it, the more I think the solution to this is to re-implement the way query gen works. > > currently query gen asks llm to return a query string. I have seen in the past that this this can cause unexpected s
goldfish/#480 [Query Fixer for Query Gen]
> > thanks for working on this! unfortunately I don't think using LLM to fix a query is the right move - you can't guarantee that this response will be valid either. IMO the more I think about it, the more I think the solution to this is to re-implement the way query gen works. > > currently query gen asks llm to return a query string. I have seen in the past that this this can cause unexpected s
goldfish/#483 [TF-3275] Use gemini for locator · app/config.py
is splitting and merging still needed for gemini locator?
goldfish/#483 [TF-3275] Use gemini for locator
The fallback logic looks good. Some changes are needed to log the actually used config.
goldfish/#483 [TF-3275] Use gemini for locator · app/common/fallback.py
Let's use `log = logging.getLogger(__name__)` so the log will include the filename