HydroYield: Smart Hydroponic Yield Predictor
An intelligent system for hydroponic farming that provides crop recommendations and yield predictions based on real-time environmental data, enabling data-driven decisions in indoor farming to improve efficiency and output.
Overview
HydroYield consists of a Flask-based backend that interfaces with IoT sensors in a hydroponic setup. It collects readings such as temperature, humidity, and pH, runs them through machine learning models, and tells farmers which crops suit the current conditions and how to adjust growing parameters to maximize yield.
From sensor data to intelligent crop recommendations
Key Features
Architecture & Components
HydroYield follows a modular service-oriented architecture for scalability and maintainability. The main components include:
API Layer (Controllers)
Flask RESTful controllers handle HTTP requests and routes. For example, the `sensor_controller` accepts incoming sensor data, and the `prediction_controller` handles endpoints for getting recommendations or predictions. This layer validates input (ensuring the JSON from devices is correct and secure) and then delegates to the appropriate service.
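The validation step described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the field names (`device_id`, `temperature`, `humidity`, `ph`), the plausibility bounds, and the `validate_reading` helper are all assumptions made for illustration.

```python
from numbers import Real

# Hypothetical field names and plausibility bounds; the real schema may differ.
FIELD_BOUNDS = {
    "temperature": (-10.0, 60.0),  # degrees Celsius
    "humidity": (0.0, 100.0),      # percent relative humidity
    "ph": (0.0, 14.0),             # pH scale
}


def validate_reading(payload):
    """Return (clean, errors) for a decoded JSON sensor payload."""
    if not isinstance(payload, dict):
        return {}, ["payload must be a JSON object"]

    clean, errors = {}, []

    device_id = payload.get("device_id")
    if not isinstance(device_id, str) or not device_id.strip():
        errors.append("device_id must be a non-empty string")
    else:
        clean["device_id"] = device_id.strip()

    for field, (low, high) in FIELD_BOUNDS.items():
        value = payload.get(field)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            errors.append(f"{field} must be a number")
        elif not low <= value <= high:
            errors.append(f"{field} must be between {low} and {high}")
        else:
            clean[field] = float(value)

    return clean, errors
```

A controller would typically reject the request with a 4xx response when `errors` is non-empty and pass `clean` on to the service layer otherwise.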
Sensor Service (Data Ingestion)
This component processes incoming sensor readings. It performs tasks such as data validation, normalization (e.g., ensuring units/ranges are consistent), and storing the readings in the database. It may also compute derived metrics (like Vapor Pressure Deficit from temperature and humidity) to enrich the dataset.
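As an illustration of a derived metric, Vapor Pressure Deficit can be computed from air temperature and relative humidity. The sketch below assumes the Tetens approximation for saturation vapor pressure; the source does not say which formula HydroYield uses, and the function names are hypothetical.

```python
import math


def saturation_vapor_pressure_kpa(temp_c):
    """Saturation vapor pressure in kPa (Tetens approximation, assumed here)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vapor_pressure_deficit_kpa(temp_c, relative_humidity_pct):
    """VPD in kPa: how much more water vapor the air could hold."""
    if not 0.0 <= relative_humidity_pct <= 100.0:
        raise ValueError("relative humidity must be between 0 and 100")
    es = saturation_vapor_pressure_kpa(temp_c)
    return es * (1.0 - relative_humidity_pct / 100.0)
```

For example, `vapor_pressure_deficit_kpa(25.0, 60.0)` evaluates to about 1.27 kPa.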
ML Service (Predictions)
Encapsulates the machine learning logic for both crop recommendation and yield prediction. When the API requests a prediction (e.g., via `/api/predict`), this service loads the latest trained models to predict the best crop or the expected yield. It uses scikit-learn models under the hood.
Data Service (Analytics)
Provides analytics and historical data retrieval. For instance, it can compute aggregates and trends over time, or fetch historical sensor data for a given device. It may also generate the summaries behind the `/api/analyze-crop/{crop}` endpoint, which analyzes how a particular crop performed historically.
Data Layer
A MongoDB database stores all data. Key collections include `sensor_data` for raw sensor readings (timestamped records of environment data), and possibly others for crops, recommended settings, or user info. MongoDB's flexible schema is useful for evolving sensor schemas or adding new data types.
Utilities & Config
Helper modules (`utils/helpers.py`) provide common functionality like formatting data, sending notifications (if implemented), and input sanitization. A configuration module (`config.py`) centralizes settings such as optimal ranges for each crop's growing conditions, threshold values, database connection strings, API keys, etc.
Modular architecture for scalable farming intelligence
Machine Learning Details
HydroYield's intelligence comes from two main ML models:
Crop Recommendation Model
A Random Forest classifier that takes environmental parameters as input and predicts the optimal crop to grow. It was trained on an agricultural dataset (`cpdata.csv`) covering ~22 crop types with features like temperature, humidity, pH, and rainfall. The model reaches roughly 97–99% accuracy on this dataset, so it can be expected to pick a suitable crop reliably when sensor readings resemble the training data. Feature importance analysis shows that pH and temperature are especially influential in the decision.
Yield Prediction Model
A Random Forest regressor that estimates the yield (quantity of produce) for a chosen crop under given conditions. It considers both the crop type and current environmental metrics to output an expected yield (with units like kg or yield index). Additionally, it provides a confidence interval for its predictions, derived from the variance across the Random Forest's trees. This gives farmers not just a single number but a range (lower and upper bounds) indicating uncertainty. If the ML model is unavailable or unsure (e.g. insufficient data), a rule-based fallback system can provide a rough estimate based on how close the conditions are to optimal ranges.
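The two ideas in this description, an interval derived from the spread across trees and a rule-based fallback, can be sketched as follows. Everything here is an assumption for illustration: the helper names, the normal-approximation multiplier `z=1.96`, the linear scoring rule, and the `max_yield` scaling are not taken from the repository. With a fitted scikit-learn `RandomForestRegressor`, the per-tree predictions can be collected by calling `predict` on each tree in its `estimators_` attribute.

```python
import statistics


def interval_from_trees(tree_predictions, z=1.96):
    """Mean prediction with a rough interval from the spread across trees.

    Assumes the per-tree predictions are approximately normally distributed.
    """
    if len(tree_predictions) < 2:
        raise ValueError("need predictions from at least two trees")
    mean = statistics.fmean(tree_predictions)
    spread = statistics.stdev(tree_predictions)
    return mean, mean - z * spread, mean + z * spread


def range_score(value, low, high):
    """1.0 inside [low, high], falling linearly to 0.0 one range-width outside."""
    if low <= value <= high:
        return 1.0
    width = high - low
    distance = low - value if value < low else value - high
    return max(0.0, 1.0 - distance / width)


def fallback_yield(readings, optimal_ranges, max_yield):
    """Rule-based estimate: scale a nominal maximum yield by the average
    closeness of each reading to its optimal range (hypothetical rule)."""
    scores = [
        range_score(readings[name], low, high)
        for name, (low, high) in optimal_ranges.items()
    ]
    return max_yield * statistics.fmean(scores)
```

The fallback returns only a point estimate. Whether the real system attaches bounds to fallback results is not stated in the source.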
Both models are developed using scikit-learn and can be re-trained with new data. The repository includes scripts for data preprocessing, model training (`ml/train_model.py` and `ml/crop_yield_predictor.py`), and evaluation of model performance. Current results show strong performance, but as more sensor and yield data are gathered from real operations, the models can be further refined.
Security and API
All endpoints require an API key, which devices or users must provide (e.g., via a header), so that only authorized sources can send or request data. Input is validated and sanitized to reject malformed data and reduce the risk of injection attacks. The Flask app also handles errors so that internal failures do not leak sensitive information to clients, which matters because the system may be exposed to untrusted IoT devices.
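A key check of this kind might be sketched as below. The header name `X-API-Key` and the `is_authorized` helper are assumptions, since the source only says the key is passed "via a header". The comparison uses the standard library's `hmac.compare_digest` to avoid leaking timing information.

```python
import hmac

API_KEY_HEADER = "X-API-Key"  # hypothetical header name


def is_authorized(headers, expected_key):
    """Return True only if the request carries the expected API key."""
    supplied = headers.get(API_KEY_HEADER, "")
    if not expected_key or not supplied:
        return False
    # Constant-time comparison of the two keys as bytes.
    return hmac.compare_digest(
        supplied.encode("utf-8"), expected_key.encode("utf-8")
    )
```

In a Flask app, a function like this would usually be called from a decorator or a `before_request` hook so that every route is covered.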
The API provides RESTful endpoints to interact with the system. Key endpoints include: submitting sensor readings (`POST /api/sensor-data`), requesting recommended crops (`POST /api/predict`), getting a yield estimate (`POST /api/predict-yield`), retrieving optimal growing conditions for a specific crop (`GET /api/optimal-conditions/{crop}`), and suggestions for improving current conditions (`POST /api/suggest-improvements`). There are also endpoints to list supported crops and fetch historical analyses. Responses are returned in JSON. This API design allows easy integration with frontend dashboards or mobile apps in the future.
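From a device's point of view, submitting a reading could look like the sketch below, which builds the request using only the standard library. The base URL, the `X-API-Key` header name, and the payload fields are assumptions. Only the `/api/sensor-data` path and the use of JSON come from the description above.

```python
import json
import urllib.request


def build_sensor_request(base_url, api_key, reading):
    """Build (but do not send) a POST request for /api/sensor-data."""
    body = json.dumps(reading).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/sensor-data",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,  # hypothetical header name
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to a running server.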
Secure, scalable, and intelligent farming solutions
Installation and Future Enhancements
The project is built with Python 3.8+ and uses Flask, pandas, numpy, scikit-learn, and PyMongo (for MongoDB connectivity) among other libraries. To set up, one would install the dependencies (via `requirements.txt`), configure a MongoDB instance, set environment variables (including an API key and DB connection URI), and then run the Flask app (`app.py`). IoT devices can then start posting data to the running server.
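The environment-variable step could be handled by a small loader like this one. The variable names `HYDROYIELD_API_KEY` and `HYDROYIELD_MONGO_URI` and the `Settings` class are hypothetical, so the names actually read by `config.py` may differ.

```python
import os
from dataclasses import dataclass

# Hypothetical variable names; check config.py for the real ones.
REQUIRED_VARS = ("HYDROYIELD_API_KEY", "HYDROYIELD_MONGO_URI")


@dataclass(frozen=True)
class Settings:
    api_key: str
    mongo_uri: str


def load_settings(env=None):
    """Read required settings, failing early if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        api_key=env["HYDROYIELD_API_KEY"],
        mongo_uri=env["HYDROYIELD_MONGO_URI"],
    )
```

Failing at startup when a variable is missing tends to be easier to diagnose than a connection or authorization error on the first request.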
HydroYield is under active development with several future enhancements planned: real-time alert notifications when conditions go out of optimal range, integration with hardware actuators for automated control of the environment (e.g., adjusting lights or pumps automatically), advanced visualization dashboards for the collected data, support for multiple user accounts with roles (e.g., multiple farmers or farm locations), and even a companion mobile application for on-the-go monitoring and control. These additions will further increase the system's value as a comprehensive smart farming assistant.
HydroYield represents a comprehensive solution for modern hydroponic farming, combining IoT sensor integration, machine learning intelligence, and secure API design to enable data-driven agricultural decisions. By providing real-time crop recommendations and yield predictions, the system empowers farmers to optimize their growing conditions and maximize production efficiency. The modular architecture ensures scalability and maintainability, while the local deployment model addresses the unique needs of farm environments with connectivity and data ownership concerns.
