How the Crop Pollination Lookup works

Methodology and verification — version 2.0, May 2026

What this tool is, and what it isn’t

The Madhukosha Crop Pollination Lookup answers one question: for any crop grown in India, what is the role of pollinators — and of Apis cerana indica, the native Indian honey bee, in particular — in producing it?

It covers 272 Indian crops across two datasets. Every claim about a crop is backed by a source, and the kind of source is shown to you, not hidden.

It is not a yield-prediction system, not a farm-management app, and not a product catalogue. It does not tell you how many hives to keep, what to plant, or what to spray. It tells you what published research says about the pollination biology of a crop — and it is honest about where that research is thin.

The principle behind it is simple: Indian crops, in Indian conditions, deserve Indian field evidence. The dataset is built from research carried out in the Indian subcontinent — ICAR institutes, State Agricultural Universities, and peer-reviewed studies of Indian crops in Indian fields. Where the evidence is a specific, checkable finding, a specific paper is cited. Where the claim is settled textbook biology, the responsible institution is named instead. Both are shown plainly, and the difference between them is the subject of most of this page.

The two datasets

The tool answers different questions for different kinds of crops, and its structure reflects that.

Flower-to-food crops — 199 entries. Crops where a flower develops into the part we eat: a fruit, grain, pod, seed, oilseed, or edible flower bud. For these, pollination is part of how the harvest forms — an unpollinated mango blossom does not become a mango — so each entry carries a pollination-dependency classification.

Non-pollination crops (73 entries). Crops where the harvested part is not flower-derived: roots like carrot, tubers like potato, leaves like spinach and curry leaf, stems like sugarcane, and mushrooms. The plant may still flower, but its flowers have nothing to do with the harvest. These entries are not given a dependency score; they explain what the crop is and why pollination is not the relevant question.

Both kinds are included deliberately. A tool that is honest about pollination has to be honest when pollination is not the issue — otherwise it quietly fails the moment a farmer searches for “potato,” and that failure would undermine trust in every other answer. Some plants appear in both datasets: banana fruit is flower-to-food, banana stem is non-pollination; onion seed needs bee pollination, the onion bulb does not. Each such entry links to its counterpart.
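The counterpart links can be pictured with a small sketch. The `Entry` record, the string labels for the two datasets, and the `counterparts` helper are illustrative assumptions, not the tool's actual schema; the crops are the ones named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical record: one harvested part of one plant."""
    plant: str
    part: str
    dataset: str  # "flower-to-food" or "non-pollination"


ENTRIES = [
    Entry("banana", "fruit", "flower-to-food"),
    Entry("banana", "stem", "non-pollination"),
    Entry("onion", "seed", "flower-to-food"),
    Entry("onion", "bulb", "non-pollination"),
    Entry("potato", "tuber", "non-pollination"),
]


def counterparts(entry, entries=ENTRIES):
    """Entries for the same plant that live in the other dataset."""
    return [e for e in entries
            if e.plant == entry.plant and e.dataset != entry.dataset]


for e in ENTRIES:
    linked = [f"{c.plant} {c.part}" for c in counterparts(e)]
    print(f"{e.plant} {e.part} ({e.dataset}) -> {linked}")
```

An entry with no counterpart, such as potato here, simply gets an empty list.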

Where the evidence comes from

Every classification in the flower-to-food dataset traces to a real, checkable source, and the sourcing follows one rule.

A source is admissible if the research was carried out in the Indian subcontinent and it is either published by an Indian institution — an ICAR institute, a State Agricultural University, an AICRP programme — or published in a reputable, peer-reviewed journal. What matters is where the research was done and whether it can be checked — not the nationality of the authors or the country of the journal.

This is a deliberate position. The tool is about Indian crops, Indian conditions, and Indian pollinators, and it should rest on research conducted in those conditions. A coffee-pollination study from Indonesia, however rigorous, describes a different place. At the same time, good research on Indian crops is sometimes published in international journals or co-authored with collaborators abroad, and excluding it for that reason alone would throw away exactly the field evidence the tool needs. The test is geographic: was the research carried out in the Indian subcontinent?

Sources the tool does not use: global databases and review papers with no Indian field data, and general web references.
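Read as logic, the sourcing rule is a single condition. The sketch below is one way to write it down; the `Source` record and its field names are hypothetical, and the flags on the two example sources are set for illustration rather than taken from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical record of the facts the sourcing rule looks at."""
    description: str
    fieldwork_in_subcontinent: bool
    indian_institution_publisher: bool  # ICAR institute, SAU, AICRP programme
    peer_reviewed_journal: bool
    foreign_coauthor: bool = False      # recorded only to show it is ignored


def is_admissible(src: Source) -> bool:
    """Geography first, then one of the two accepted publication routes."""
    published_ok = src.indian_institution_publisher or src.peer_reviewed_journal
    return src.fieldwork_in_subcontinent and published_ok


# Illustrative flags (assumptions), modelled on the cases discussed on this page.
kodagu = Source("Robusta coffee, Kodagu agroforests",
                fieldwork_in_subcontinent=True,
                indian_institution_publisher=False,
                peer_reviewed_journal=True,
                foreign_coauthor=True)
indonesia = Source("Coffee pollination, Indonesia",
                   fieldwork_in_subcontinent=False,
                   indian_institution_publisher=False,
                   peer_reviewed_journal=True)

print(is_admissible(kodagu))     # True
print(is_admissible(indonesia))  # False
```

Note that `foreign_coauthor` never appears in `is_admissible`: author nationality is not part of the test.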

How a crop’s pollination is described

Each flower-to-food crop carries three separate pieces of information. They answer different questions and should not be run together.

How it is pollinated — the mechanism: insect-pollinated, wind-pollinated, self-pollinated, mixed, and a few less common modes. This is botany — how pollen actually moves for that crop.

How much its yield depends on pollinators — the dependency tier, in four levels. Highly dependent: without insects the crop loses most of its yield. Moderately dependent: insects substantially raise yield, though the crop still sets some fruit without them. Slightly dependent: pollinators help a little, but most of the crop sets fruit regardless. Not dependent: the crop self-pollinates or is wind-pollinated, and bees are not needed. A crop’s tier reflects what the research shows about yield when pollinators are absent — measured, where studies exist, by caging flowers off from insects and comparing against open pollination.

What the native Indian honey bee does — the role of Apis cerana indica specifically. A crop can be insect-pollinated and still not depend on this particular bee; another insect may do most of the work. So this is kept separate: primary pollinator, one of several, a minor visitor, not a known pollinator, or — where the crop’s natural pollinator is absent from India — hand-pollinated. A sixth state, contested, is explained further below.
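One way to picture the three separate fields is as three independent enumerations on a single record. This is a sketch only: the class and member names are assumptions, and the page does not state numeric cut-offs between dependency tiers, so none are invented here. The helper shows the cage-versus-open comparison as plain arithmetic.

```python
from dataclasses import dataclass
from enum import Enum


class Mechanism(Enum):
    INSECT = "insect-pollinated"
    WIND = "wind-pollinated"
    SELF = "self-pollinated"
    MIXED = "mixed"
    OTHER = "less common mode"


class Dependency(Enum):
    HIGH = "highly dependent"
    MODERATE = "moderately dependent"
    SLIGHT = "slightly dependent"
    NONE = "not dependent"


class BeeRole(Enum):
    PRIMARY = "primary pollinator"
    ONE_OF_SEVERAL = "one of several"
    MINOR_VISITOR = "minor visitor"
    NOT_KNOWN = "not a known pollinator"
    HAND_POLLINATED = "hand-pollinated"
    CONTESTED = "contested"


@dataclass(frozen=True)
class FlowerToFoodEntry:
    crop: str
    mechanism: Mechanism
    dependency: Dependency
    apis_cerana_role: BeeRole


def exclusion_yield_loss(open_yield: float, caged_yield: float) -> float:
    """Fraction of open-pollinated yield lost when insects are caged out."""
    if open_yield <= 0:
        raise ValueError("open-pollinated yield must be positive")
    return 1.0 - caged_yield / open_yield


# Hypothetical crop: insect-pollinated and highly dependent, yet the
# native honey bee is only a minor visitor. The fields do not imply each other.
example = FlowerToFoodEntry("example crop (hypothetical)",
                            Mechanism.INSECT, Dependency.HIGH,
                            BeeRole.MINOR_VISITOR)
print(example.dependency.value, "/", example.apis_cerana_role.value)
print(exclusion_yield_loss(open_yield=100.0, caged_yield=25.0))  # 0.75
```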

The two-tier citation system

Not every fact needs the same kind of source, and pretending otherwise would be false precision. Every citation in the flower-to-food dataset is tagged Tier 1 or Tier 2.

Tier 1 is a specific paper. It is required wherever the tool makes a specific, quantitative, or contested claim — a yield-loss figure, a finding that corrects a common misconception, any number a reader might reasonably challenge. For these, a named study is cited: authors, year, journal, and a link where one exists.

Tier 2 is institutional attribution. Much of what the tool states is settled, textbook biology — that rice is wind-pollinated, that cluster bean is largely self-pollinating. For claims like these, singling out one paper would be arbitrary: the fact belongs to the field, not to a study. So the tool names the institution that is the recognised authority for that crop — ICAR-IIHR for horticulture, ICAR-IIPR for pulses, ICAR-IISR for spices, and so on. Tier 2 is not a weaker Tier 1. It is the honest citation for a different kind of claim.

Every citation is tagged individually, so a reader can see at a glance what kind of evidence sits behind each statement. Each crop’s page carries its own full citation list — that per-crop list is where you trace what a given classification rests on. Non-pollination entries carry no tier markers: they make category statements (“the part eaten is the tuber”), not pollination claims that need this kind of verification, and the tier system does not apply to them.
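The choice of tier follows from the kind of claim, not from the crop. A minimal sketch of that decision, assuming a hypothetical `Claim` record with one flag per trigger named above:

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PAPER = 1        # Tier 1: a specific, named study
    INSTITUTION = 2  # Tier 2: institutional attribution


@dataclass(frozen=True)
class Claim:
    """Hypothetical description of a single statement on a crop entry."""
    text: str
    specific: bool = False
    quantitative: bool = False
    contested: bool = False


def required_tier(claim: Claim) -> Tier:
    """Specific, quantitative, or contested claims need a named paper."""
    if claim.specific or claim.quantitative or claim.contested:
        return Tier.PAPER
    return Tier.INSTITUTION


textbook = Claim("Rice is wind-pollinated.")
figure = Claim("A yield-loss figure under insect exclusion.", quantitative=True)

print(required_tier(textbook))  # Tier.INSTITUTION
print(required_tier(figure))    # Tier.PAPER
```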

The honest picture, and how the dataset was verified

Here is the part the tool will not dress up. Across the 199 flower-to-food entries there are 265 citations: 29 are Tier 1, and 236 are Tier 2. Only 25 entries, about one in eight, carry any paper-level Tier-1 backing at all. The other seven in eight rest on institutional attribution alone.
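The figures above can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones stated on this page.

```python
TIER_1_CITATIONS = 29
TIER_2_CITATIONS = 236
FLOWER_TO_FOOD_ENTRIES = 199
ENTRIES_WITH_TIER_1 = 25

total_citations = TIER_1_CITATIONS + TIER_2_CITATIONS
entry_share = ENTRIES_WITH_TIER_1 / FLOWER_TO_FOOD_ENTRIES
citation_share = TIER_1_CITATIONS / total_citations

print(total_citations)          # 265
print(f"{entry_share:.1%}")     # 12.6%, close to one in eight (12.5%)
print(f"{citation_share:.1%}")  # 10.9% of all citations are Tier 1
```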

This is what an honest reading of Indian pollination research looks like. For a small number of well-studied crops (apple, mustard, large cardamom, a few others) Indian peer-reviewed pollination science is rich and directly attributable. For most crops it is not: the relevant institute exists, has worked on the crop’s agronomy and varieties for decades, and is the national authority, but specific peer-reviewed studies of the crop’s pollination and the role of Indian bees are sparse or absent. This is especially true for underutilised regional fruits, minor pulses, and hill and Northeast crops. The tool does not paper over this gap. A reasonable, textbook-supported claim with an institutional citation is Tier 2, and it is labelled as Tier 2.

The dataset reached this state through a verification audit. Every citation that had been marked Tier 1 — 72 of them at the start — was checked against the document it named, on three tests: that the document genuinely exists; that it actually supports the specific claim attached to it; and that it meets the sourcing rule above. The audit ran in batches, each one reviewed before any change was applied, and all changes were then applied together in a single pass, with automated checks confirming every citation ended where the review had placed it.

Of those 72 Tier-1 citations: 35 were re-tiered to Tier 2, because what they supported was textbook biology rather than a specific contested claim; 13 were removed, because the cited document could not be confirmed or did not support the claim it was attached to; and 24 were kept. Seven new Tier-1 papers were found by re-sourcing entries that needed better evidence. Five dependency estimates that were more confident than the Indian evidence supported were reconciled to what the research actually shows; the recurring error was that the pollinator dependency of self-fertile crops had been overstated. A later follow-up pass closed the remaining flagged items. The full audit trail, recording every citation removed, re-tiered, or corrected, with the reason, is kept on record.
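The audit's three tests and its three outcomes can be sketched as a small decision function. This is an illustration under assumptions, not the audit's actual tooling: the function and enum names are hypothetical, and treating a sourcing-rule failure as a removal is an assumption the page does not spell out. The tally at the end uses the reported counts.

```python
from enum import Enum


class Outcome(Enum):
    KEEP = "kept as Tier 1"
    RETIER = "re-tiered to Tier 2"
    REMOVE = "removed"


def audit_citation(exists: bool, supports_claim: bool,
                   meets_sourcing_rule: bool, claim_is_textbook: bool) -> Outcome:
    """Apply the three tests, then ask what kind of claim is being backed."""
    # Assumption: failing any of the three tests leads to removal.
    if not (exists and supports_claim and meets_sourcing_rule):
        return Outcome.REMOVE
    # A sound source backing settled biology belongs in Tier 2.
    if claim_is_textbook:
        return Outcome.RETIER
    return Outcome.KEEP


# Reported outcome counts for the citations originally marked Tier 1.
REPORTED = {Outcome.RETIER: 35, Outcome.REMOVE: 13, Outcome.KEEP: 24}
audited_total = sum(REPORTED.values())

print(audit_citation(True, True, True, claim_is_textbook=True).value)
print(audited_total)  # 72
```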

One finding from that audit is worth stating plainly, because it shaped the result: the dataset’s least reliable field was the one naming which institution stood behind a citation. It was wrong often enough that verification now treats it as unverified until checked against the cited document itself. The tiering described here is the outcome of doing that checking.

Three worked examples

Why a foreign co-author does not disqualify a study — robusta coffee. Robusta coffee’s classification rests on a study of pollination in coffee agroforests in Kodagu, in the Western Ghats, measuring fruit set under insect, wind, and hand pollination. The study has a co-author based at a university abroad. Under a rule that tested author nationality, it would have been excluded. Under the rule the tool actually uses, it is admissible — and is the entry’s Tier-1 source — because the research was carried out in the Indian subcontinent, on an Indian crop, in Indian conditions. That is the test that matters.

Why some entries have only Tier-2 citations — cluster bean. Cluster bean (guar) is one of India’s most globally significant crops — India grows the large majority of world supply — and ICAR-IIPR, CAZRI, IGFRI and the AICRP on Arid Legumes have all worked on it for decades. Its pollination biology is well understood: it is predominantly self-pollinating, with some bee-aided outcrossing. But when the audit looked for an Indian peer-reviewed paper documenting that specific pollination behaviour, none existed. The IGFRI document is about forage variety development; the journal paper is about the seed-production chain — both real, both authoritative, both about something other than pollination. The entry sits, honestly, at three Tier-2 citations. The claim is sound textbook biology; the paper-level evidence specific to it simply has not been published. This is the genuine situation for a large share of India’s minor crops, and the tool shows it rather than hiding it.

Why some entries show two opposing sources — large cardamom. For large cardamom, two credible Indian field studies disagree about the role of the native honey bee — one finding it the principal pollinator, the other placing it among several. The tool does not quietly pick a side. The entry’s pollinator role is marked contested, and both studies are shown, so a reader sees the disagreement and its sources for themselves. Madhukosha aggregates research; it does not adjudicate it. Where the science is unsettled, the honest thing is to show that it is unsettled.
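The contested state can be expressed as a simple rule: if the field studies on an entry agree, show their shared finding; if they disagree, show "contested" and list every study. A minimal sketch, with hypothetical names and placeholder study labels rather than real citations:

```python
from typing import List, NamedTuple, Tuple


class Finding(NamedTuple):
    study: str  # placeholder label, not a real citation
    role: str   # the role the study assigns to Apis cerana indica


def displayed_role(findings: List[Finding]) -> Tuple[str, List[str]]:
    """Return the role to display and the studies shown beside it."""
    if not findings:
        raise ValueError("at least one finding is required")
    roles = {f.role for f in findings}
    # Agreement: show the shared role. Disagreement: do not pick a side.
    role = next(iter(roles)) if len(roles) == 1 else "contested"
    return role, [f.study for f in findings]


large_cardamom = [
    Finding("Study A (placeholder)", "primary pollinator"),
    Finding("Study B (placeholder)", "one of several"),
]
print(displayed_role(large_cardamom))
# ('contested', ['Study A (placeholder)', 'Study B (placeholder)'])
```

Both studies stay in the output either way, so the reader always sees what the displayed role rests on.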

The Misconceptions note

Some crops carry a short note correcting a common but mistaken belief about their pollination — for instance, the assumption that a crop behaves like a related crop it is often grown beside. This note appears only where there is a genuine, documented misconception worth correcting. It is not a slot to be filled on every crop; a manufactured “myth” would be worse than none. Where it appears, it is held to the same sourcing standard as everything else on the entry.

Maintenance and provenance

This page describes the dataset as it stands in May 2026. The dataset is not auto-updated against external databases, and citations do not refresh themselves — which is a deliberate choice. A small dataset that is checked by hand is more trustworthy than a large one whose citations decay silently.

Madhukosha commits to maintaining it. As new Indian pollination research appears, and as readers point out errors, the dataset is reviewed and corrected, and changes are dated and described. Classifications are revised when better evidence appears — correction is part of the normal work, not an admission of failure. There is no fixed schedule; the commitment is that the tool is maintained rather than left to drift.

If you believe something here is wrong — a classification, a citation, a tier — tell us, and point us to the source. Corrections that hold up are made with the same checking the dataset was built with. Write to hello@madhukosha.org.