Mastering Gemini CLI
Learn how to use Gemini CLI, an interactive AI agent that carries out autonomous software engineering tasks directly from your terminal.
Module 1: Meet Your New Peer Engineer
What exactly is Gemini CLI?
Gemini CLI is not just another chatbot window you paste code into. It is a state-of-the-art interactive CLI agent designed to operate autonomously within your local workspace. Think of it as a senior engineer sitting next to you, capable of reading your code, running shell commands, and applying surgical code edits safely.
Core Capabilities
- Workspace Navigation: It searches, greps, and reads local files to build its own context dynamically. You don't need to manually upload files.
- Command Execution: It can run terminal commands like `npm run test`, `tsc`, or `git status` to validate its work natively.
- Surgical Edits: Instead of dumping huge blocks of code for you to copy-paste, it edits files directly using search-and-replace tools.
- Sub-agent Orchestration: It delegates repetitive or complex codebase investigations to specialized background sub-agents.
Security & System Integrity First
The CLI is programmed with strict security mandates. It will never log, print, or commit secrets or `.env` variables. Furthermore, it strictly adheres to your local coding conventions and architecture, blending into your existing project seamlessly. It will always explain critical commands that modify state before executing them.
Module 2: The Development Lifecycle
Research, Strategy, Execution
Gemini CLI operates using a strict, professional engineering methodology. It does not blindly guess how your code works; it verifies it through a structured loop.
1. Research Phase
Before touching any code, the agent uses `grep_search` and `glob` to map the codebase and validate assumptions. If you report a bug, it will attempt to empirically reproduce the failure state first.
2. Strategy Phase
Based on its findings, it formulates a grounded plan and briefly summarizes its strategy before making changes. It distinguishes between Inquiries (answering questions) and Directives (doing work).
3. Execution Phase (Plan ➔ Act ➔ Validate)
The CLI plans its specific implementation, applies targeted changes, and exhaustively validates the result by running local build scripts, linters, or unit tests.
Validation is Mandatory
Unlike web-based LLMs that just provide code, Gemini CLI's job isn't done until the code runs. If a TypeScript compiler throws an error, the agent reads the error, backtracks, and fixes its own code autonomously until the build turns green.
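The validate-and-retry loop described above can be sketched in a few lines. This is only an illustration of the idea, not the agent's actual implementation: `validate_until_green`, `Check`, and `Fixer` are hypothetical names, and the "build" in the demo is simulated rather than a real compiler run.

```python
from typing import Callable, Tuple

# Hypothetical types: a check returns (passed, error_output);
# a fixer receives the error text and changes the code in response.
Check = Callable[[], Tuple[bool, str]]
Fixer = Callable[[str], None]


def validate_until_green(check: Check, fix: Fixer, max_attempts: int = 3) -> int:
    """Run `check`; on failure hand the error to `fix` and validate again.

    Returns the number of check runs needed to pass.
    Raises RuntimeError if the check still fails after `max_attempts` runs.
    """
    error = ""
    for attempt in range(1, max_attempts + 1):
        passed, error = check()
        if passed:
            return attempt
        if attempt < max_attempts:
            fix(error)  # read the error, adjust, then re-validate
    raise RuntimeError(f"still failing after {max_attempts} attempts: {error}")


def demo() -> int:
    """Simulate a build with one compiler error that a single fix resolves."""
    remaining_errors = ["TS2322: Type 'string' is not assignable to type 'number'."]

    def fake_build() -> Tuple[bool, str]:
        if remaining_errors:
            return False, remaining_errors[0]
        return True, ""

    def fake_fix(error: str) -> None:
        remaining_errors.pop(0)  # pretend the reported error was fixed

    return validate_until_green(fake_build, fake_fix)
```

In a real project the check would more plausibly wrap a build or test command (for example, running `tsc --noEmit` through `subprocess.run` and inspecting the return code), but that wiring is left out here to keep the sketch self-contained.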
Module 3: Core Tools Deep Dive
How the agent interacts with your OS
Gemini CLI doesn't use magic; it uses specific, highly-optimized API tools to perform tasks safely and efficiently. Understanding these tools helps you understand the agent's behavior.
- `grep_search`: Powered by ripgrep. It is exceptionally fast and automatically limits output to prevent flooding the context window. It's preferred over standard terminal grep.
- `read_file`: Reads file contents. To save tokens, the agent uses `start_line` and `end_line` parameters for surgical reads rather than dumping thousands of lines into memory.
- `replace` & `write_file`: The primary tools for editing. `replace` modifies a specific block of text, while `write_file` creates or overwrites entire files.
- `run_shell_command`: Executes scripts. The agent prefers non-interactive flags (like `npm install --silent` or `git --no-pager`) to reduce output noise.
- `ask_user`: When a request is ambiguous, the agent can pause and prompt you with multi-choice or text questions to clarify requirements before proceeding.
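To make the token-saving idea behind ranged reads and capped search output concrete, here is a minimal Python sketch. It is not the tools' real implementation: `read_lines` and `limited_search` are hypothetical helpers, the 1-indexed inclusive line range is an assumption about how `start_line` and `end_line` behave, and plain `re` stands in for ripgrep.

```python
import re
from pathlib import Path
from typing import List, Optional, Tuple


def read_lines(path: Path, start_line: int = 1, end_line: Optional[int] = None) -> str:
    """Return lines start_line..end_line (1-indexed, inclusive) of a text file."""
    lines = path.read_text(encoding="utf-8").splitlines()
    first = max(start_line, 1) - 1
    return "\n".join(lines[first:end_line])


def limited_search(root: Path, pattern: str, max_matches: int = 20) -> List[Tuple[str, int, str]]:
    """Regex-search files under `root`, stopping once `max_matches` hits are found.

    Each hit is (relative path, line number, stripped line text).
    """
    regex = re.compile(pattern)
    hits: List[Tuple[str, int, str]] = []
    for file in sorted(root.rglob("*")):
        if not file.is_file():
            continue
        try:
            text = file.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((file.relative_to(root).as_posix(), number, line.strip()))
                if len(hits) >= max_matches:
                    return hits  # cap the output instead of flooding the caller
    return hits
```

The point of both helpers is the same: return only the slice of information the next step needs, so the caller's context stays small.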
Module 4: Sub-agent Orchestration
Delegating complex tasks
As an orchestrator, Gemini CLI protects its own context window. If you ask it to do something massive, it delegates the work to specialized background sub-agents.
The Generalist
Used for repetitive batch tasks (e.g., "Add license headers to all 50 files in src/") or running commands with high-volume output. The generalist does the heavy lifting and returns a single summary to the main session.
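The "heavy lifting in, single summary out" pattern can be illustrated with the license-header example. This is a hedged sketch, not how the sub-agent is implemented: `add_license_headers` is a hypothetical function, and the `.ts` suffix is an arbitrary choice for the illustration.

```python
from pathlib import Path


def add_license_headers(root: Path, header: str, suffix: str = ".ts") -> str:
    """Prepend `header` to every matching file that lacks it.

    Returns a one-line summary rather than per-file output, so the caller
    receives a compact result regardless of how many files were touched.
    """
    updated = skipped = 0
    for file in sorted(root.rglob(f"*{suffix}")):
        if not file.is_file():
            continue
        text = file.read_text(encoding="utf-8")
        if text.startswith(header):
            skipped += 1  # header already present, leave the file alone
            continue
        file.write_text(header + "\n" + text, encoding="utf-8")
        updated += 1
    return f"license headers: {updated} updated, {skipped} already present"
```

Running it twice on the same directory is safe: the second run reports every file as already present and changes nothing.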
Codebase Investigator
Used for architectural mapping and bug root-cause analysis. If you say "Why is the authentication flow failing?", it dispatches the investigator to map dependencies and return actionable insights.
Module 5: Specialized Agent Skills
Activating expert workflows
Beyond its core capabilities, Gemini CLI comes equipped with specialized "skills" that can be activated on demand. These skills provide the agent with expert procedural guidance and workflows for specific problem domains.
Available Skills
- `chrome-devtools`: Advanced browser automation, debugging, and performance analysis via MCP.
- `debug-optimize-lcp`: Expert guidance on optimizing Largest Contentful Paint (LCP) and Core Web Vitals.
- `a11y-debugging`: Comprehensive accessibility auditing based on web.dev guidelines.
- `troubleshooting`: Automated diagnostics for connection and environment issues.
- `skill-creator`: Yes, the agent even has a skill to help you create *new* custom skills!
When you ask the CLI to perform a task matching one of these domains, it uses the `activate_skill` tool to load specialized instructions, temporarily transforming into a domain expert.
Module 6: Web Research & Information Gathering
Accessing the outside world
Gemini CLI isn't limited to your local filesystem. It has native tools to research modern documentation, troubleshoot obscure errors, or pull external resources.
google_web_search
Performs a grounded Google Search to find up-to-date information across the internet, returning synthesized answers with citations. Perfect for "What are the latest breaking changes in React 19?"
web_fetch
Analyzes and extracts information directly from URLs. It can even read raw code from GitHub repositories or summarize long API documentation pages based on your specific prompts.
Module 7: Browser Automation (Chrome DevTools MCP)
Controlling the browser natively
Through the Model Context Protocol (MCP), Gemini CLI has deep integration with Chrome DevTools. It can spawn browsers, navigate pages, and interact with the DOM as if a human were clicking around.
Key Automation Capabilities
- Navigation & Inspection: `navigate_page`, `take_snapshot` (accessibility tree text), and `take_screenshot`.
- Interaction: `click`, `fill`, `type_text`, and `hover` elements precisely using unique IDs derived from snapshots.
- Debugging: `get_console_message`, `list_network_requests`, and evaluating arbitrary JavaScript in the page context.
- Performance: Run Lighthouse audits (`lighthouse_audit`) or capture detailed performance traces and memory heap snapshots.
*This allows the CLI to actually test the UIs it builds by interacting with the development server!*
Module 8: Image & Asset Generation (Nano Banana MCP)
Creating visual assets on the fly
Need a placeholder image, a complex architectural diagram, or a seamless background pattern? Gemini CLI integrates with the Nano Banana extension to generate high-quality visual assets directly into your workspace.
- `generate_image` for photorealistic or stylized art, and `generate_icon` for favicons or UI elements in multiple sizes.
- `generate_pattern` for seamless background patterns, or `generate_diagram` for technical flowcharts and architecture mockups.
- `edit_image` to modify an existing image, or `restore_image` to enhance its quality.
- `generate_story` for tutorials or multi-step flows.
Module 9: Context Efficiency
Managing your token budget
The agent passes the full conversation history with each subsequent message. The longer the session goes on, the more expensive each turn becomes in terms of API tokens and processing time.
Optimization Strategies
- Parallel Operations: The CLI executes multiple independent tool calls simultaneously (e.g., searching three different directories at once).
- Conservative Search Scopes: It utilizes `include_pattern` and `exclude_pattern` to narrow down grep searches, preventing massive, useless outputs.
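Both strategies above can be sketched together in Python. This is an illustration under stated assumptions, not the CLI's real behavior: `scoped_files` and `search_in_parallel` are hypothetical helpers, and the shell-style matching from `fnmatch` only approximates whatever glob semantics `include_pattern` and `exclude_pattern` actually use.

```python
from concurrent.futures import ThreadPoolExecutor
from fnmatch import fnmatchcase
from pathlib import Path
from typing import Dict, List


def scoped_files(root: Path, include_pattern: str, exclude_pattern: str = "") -> List[str]:
    """List files under `root` whose relative path matches include but not exclude."""
    selected: List[str] = []
    for file in sorted(root.rglob("*")):
        if not file.is_file():
            continue
        rel = file.relative_to(root).as_posix()
        if not fnmatchcase(rel, include_pattern):
            continue
        if exclude_pattern and fnmatchcase(rel, exclude_pattern):
            continue
        selected.append(rel)
    return selected


def search_in_parallel(
    roots: List[Path], include_pattern: str, exclude_pattern: str = ""
) -> Dict[str, List[str]]:
    """Scan several independent directories concurrently, one worker per directory."""
    with ThreadPoolExecutor(max_workers=max(len(roots), 1)) as pool:
        results = pool.map(
            lambda root: scoped_files(root, include_pattern, exclude_pattern), roots
        )
        return {str(root): files for root, files in zip(roots, results)}
```

Because the directory scans do not depend on each other, they can run at the same time, and the include and exclude filters keep each result list short.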
Pro Tip: Use New Sessions!
If you've just finished a large refactoring that took 20 turns, restart the CLI or start a new session before moving on to an unrelated bug fix. Clearing the context window is one of the most effective ways to keep responses fast and cost-effective.
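A rough back-of-the-envelope model shows why a fresh session helps. It assumes, purely for illustration, that every turn adds the same number of tokens and that the whole history is resent each time; real sessions also involve tool output, system prompts, and possibly caching, none of which is modeled here. The function name and the 1,000-token figure are hypothetical.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total tokens sent over a session when every turn resends the full history.

    Turn k sends k * tokens_per_turn tokens, so the total grows quadratically:
    tokens_per_turn * turns * (turns + 1) / 2.
    """
    return sum(turn * tokens_per_turn for turn in range(1, turns + 1))


# One 30-turn session versus a 20-turn session followed by a fresh 10-turn one.
one_long_session = cumulative_input_tokens(30, 1000)   # 465000
split_sessions = (
    cumulative_input_tokens(20, 1000)                   # 210000
    + cumulative_input_tokens(10, 1000)                 # 55000
)                                                       # 265000 in total
tokens_saved = one_long_session - split_sessions        # 200000
```

Under these simplified assumptions, splitting the work at the 20-turn mark sends roughly 43% fewer input tokens for the same 30 turns.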
Module 10: Autonomous Mode (YOLO)
Taking the training wheels off
By default, the CLI is cautious and will often seek your approval or clarification before committing to a major architectural shift. However, for maximum productivity, you can instruct it to operate autonomously.
In this mode, the agent minimizes interruptions. It makes reasonable assumptions based on existing code patterns, follows established conventions, and persists through errors without stopping to ask you what to do next.
When to use YOLO mode:
- Scaffolding Apps: "Build a completely new landing page using Next.js, Tailwind, and Framer Motion. Do not ask for clarification, just build the entire functional prototype."
- Batch Refactoring: "Update all components in the `src/ui` folder to use the new variant prop interface. Run the linter after each file."
- Comprehensive Testing: "Write Playwright end-to-end tests for the entire user authentication flow and run them until they pass."
Module 11: Project Rules & GEMINI.md
Teaching the agent your conventions
You can permanently teach Gemini CLI how to behave in your specific repository by creating a GEMINI.md file in your project root.
The Hierarchy of Context
The agent respects instructions found in GEMINI.md above its own base instructions. You can use this file to define:
- Tech Stack Preferences: "Always use Vitest instead of Jest for testing."
- Architecture Rules: "All components must be functional and use arrow-function syntax. State must be managed via Zustand, not Redux."
- Formatting: "Never use TailwindCSS in this repository. Use pure CSS modules."
Global Memory
For personal preferences across all your projects, the agent has access to a save_memory tool. You can simply tell the CLI: "Always remember that my name is Alex and I prefer Vim over Nano", and it will persist that fact across all future sessions globally.
Module 12: Git Workflows
Automating version control safely
The agent is hyper-aware of your Git repository. By default, it will never stage or commit files without your explicit permission to protect your repository history.
When explicitly asked to commit:
The agent executes a very specific, professional workflow:
- It runs `git status` and `git diff HEAD` to review exactly what has changed.
- It runs `git log -n 3` to analyze your recent commit history and match your team's formatting style (e.g., conventional commits, verbosity, casing).
- It proposes a draft commit message for your approval.
- It executes the commit and verifies success.
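The information-gathering part of this workflow can be sketched as follows. It is a minimal illustration, not the agent's code: `run_git`, `gather_commit_context`, and `uses_conventional_commits` are hypothetical names, the `--pretty=format:%s` flag is added here only to make the log easy to parse, and the regex covers just the common Conventional Commits types.

```python
import re
import subprocess
from typing import Callable, Dict, List

Runner = Callable[[List[str]], str]


def run_git(args: List[str]) -> str:
    """Run a git command without a pager and return its stdout."""
    result = subprocess.run(
        ["git", "--no-pager", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def gather_commit_context(run: Runner = run_git) -> Dict[str, str]:
    """Collect what changed and how recent commits were worded."""
    return {
        "status": run(["status"]),
        "diff": run(["diff", "HEAD"]),
        # Subject lines only; the format flag is an addition for this sketch.
        "log": run(["log", "-n", "3", "--pretty=format:%s"]),
    }


# Matches subjects such as "feat(ui): add button" or "fix: handle null".
CONVENTIONAL = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([^)]+\))?!?: "
)


def uses_conventional_commits(log_subjects: str) -> bool:
    """Return True if every non-empty subject line follows the conventional style."""
    subjects = [line for line in log_subjects.splitlines() if line.strip()]
    return bool(subjects) and all(CONVENTIONAL.match(s) is not None for s in subjects)
```

Passing the runner in as a parameter keeps the sketch testable without a repository; a draft message would then be written in whichever style the recent history suggests and shown to you for approval before any commit is made.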
Safety Rule
The agent is programmed to never push changes to a remote repository (like GitHub) autonomously. You must explicitly instruct it to run `git push`.