Duplicate and Similarity Check

Prevent fraud by catching repeated and suspiciously similar receipt submissions.

Taggun detects receipts that are submitted more than once, even if they have been slightly altered. This helps keep rewards promotions and expense reports honest.

Key Capabilities

  • Duplicate Detection: Detects repeated submission of identical receipts.
  • Shared Receipt Detection: Mitigates collaborative fraud by identifying suspiciously similar receipts across different users.
  • Modified Receipt Detection: Catches fraudulent attempts even when receipt images have been slightly altered.

How It Works

After the text is extracted from the receipt, our system analyses the data to determine how similar one receipt is to another. Even if users attempt to manipulate the image, our system can flag these suspicious submissions for further investigation.

  1. Data Extraction: Taggun extracts text and supporting data from the image.
  2. Similarity Analysis: Extracted data is compared with other receipts in your account (or, for receipt validation endpoints, with other receipts in the same campaign).
  3. Intelligent Scoring: A similarity score is generated, with scores above 0.9 flagged as potentially suspicious.
  4. Merchant-Specific Thresholds: Score thresholds are adjusted per merchant to reduce false positives for receipts that are naturally similar.
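The scoring and threshold steps above can be pictured with a small sketch. This is only an illustration of the idea, not Taggun's internal implementation: the 0.9 default comes from the text above, while the `MERCHANT_THRESHOLDS` table, the merchant name, and the `is_suspicious` helper are hypothetical.

```python
from typing import Optional

# From the text: scores above 0.9 are flagged as potentially suspicious.
DEFAULT_THRESHOLD = 0.9

# Hypothetical per-merchant overrides (made-up name and value) for merchants
# whose legitimate receipts naturally look alike.
MERCHANT_THRESHOLDS = {"Daily Coffee Co": 0.97}


def is_suspicious(score: float, merchant: Optional[str] = None) -> bool:
    """Return True when the score exceeds the threshold for this merchant."""
    threshold = MERCHANT_THRESHOLDS.get(merchant, DEFAULT_THRESHOLD)
    return score > threshold
```

With a higher threshold for a merchant, a score of 0.95 that would be flagged by default is no longer flagged for that merchant, which is the false-positive reduction the last step describes.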

Setup Process

1. Contact Taggun

Reach out to [email protected] to enable this feature for your account.


2. Update API Requests

Data Extraction API Endpoints

No changes are required. This feature is automatically enabled once activated by Taggun.


Receipt Validation API Endpoints

Include the following optional field in your request:

   "fraudDetection": {
       "allowSimilarityCheck": true
   }

3. Submit an API Request

When submitting a receipt for validation, you can include these optional fields for enhanced tracking and fraud detection:

{
  "referenceId": "your_unique_submission_id",
  "userId": "your_system_user_id"
}
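As a minimal sketch, the optional fields from steps 2 and 3 could be assembled like this before being merged into the rest of your validation request. The helper name `build_fraud_fields` is hypothetical, and placing all three fields at the same level of the request body is an assumption; check the endpoint reference for the exact request shape.

```python
from typing import Any, Dict, Optional


def build_fraud_fields(
    reference_id: Optional[str] = None,
    user_id: Optional[str] = None,
    allow_similarity_check: bool = True,
) -> Dict[str, Any]:
    """Build the optional similarity-check fields for a validation request."""
    fields: Dict[str, Any] = {
        "fraudDetection": {"allowSimilarityCheck": allow_similarity_check}
    }
    # Both identifiers are optional, so only include the ones provided.
    if reference_id is not None:
        fields["referenceId"] = reference_id
    if user_id is not None:
        fields["userId"] = user_id
    return fields


fields = build_fraud_fields("your_unique_submission_id", "your_system_user_id")
```

Reusing the same `referenceId` when a user legitimately resubmits a receipt keeps that resubmission from being flagged.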

Understanding Request Parameters

Field Name | Type | Description
--- | --- | ---
referenceId (optional) | string | Your system's unique ID for tracking receipts. Prevents false positives for legitimate resubmissions, as receipts with the same referenceId are not flagged as suspicious. If omitted, Taggun generates a trackingId. Using referenceId offers better integration and control over receipt management.
userId (optional) | string | Your system's unique user identifier. Useful for tracking suspicious behavior across users, e.g., the same receipt being uploaded by different users.

Understanding Response Properties

Field Name | Type | Description
--- | --- | ---
trackingId | string | Taggun's unique ID for the receipt. Used to track receipts within Taggun's system if no referenceId is provided. When used with referenceId, trackingId serves as an additional identifier, enhancing duplicate detection and resubmission management across both systems.
entities.similarReceipts | array | A list of similar receipts, if any are found.
entities.similarReceipts[index].similarityScore | number | Similarity score (0-1). A higher score indicates greater similarity.
entities.similarReceipts[index].trackingId | string | Taggun's trackingId for the similar receipt.
entities.similarReceipts[index].referenceId | string | The referenceId of the similar receipt (if available).
entities.similarReceipts[index].userId | string | The userId associated with the similar receipt (if available).

Example Response


When no similar receipts are found:

{
  "trackingId": "T-20241001-8439425",
  "entities": {
    "similarReceipts": []
  }
}


When similar receipts are found:

{
  "entities": {
    "similarReceipts": [
      {
        "referenceId": "RCPT-2024-10-07-001",
        "userId": "USER12345",
        "trackingId": "T-20241001-8439425",
        "similarityScore": 0.95
      },
      {
        "referenceId": "RCPT-2024-10-07-002",
        "userId": "USER67890",
        "trackingId": "T-20241001-4801341",
        "similarityScore": 0.95
      }
    ]
  }
}
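A response like the ones above could be read into typed records along these lines. This is a sketch under assumptions: the `SimilarReceipt` class and `parse_similar_receipts` function are hypothetical names, and the score key is taken to be `similarityScore`, as it appears in the example responses.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SimilarReceipt:
    tracking_id: str
    similarity_score: float
    reference_id: Optional[str] = None  # present only if one was supplied
    user_id: Optional[str] = None       # present only if one was supplied


def parse_similar_receipts(response: Dict[str, Any]) -> List[SimilarReceipt]:
    """Extract entities.similarReceipts; returns [] when nothing was found."""
    entries = (response.get("entities") or {}).get("similarReceipts") or []
    return [
        SimilarReceipt(
            tracking_id=entry["trackingId"],
            similarity_score=float(entry["similarityScore"]),
            reference_id=entry.get("referenceId"),
            user_id=entry.get("userId"),
        )
        for entry in entries
    ]
```

An empty `similarReceipts` array and a missing `entities` object both yield an empty list, so callers can treat "no similar receipts" uniformly.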

Receipt Validation Example

A user submits the same receipt twice to claim a reward, and the system flags it as a duplicate. The user then tries to bypass detection by altering details such as the merchant ID and transaction date. Despite the changes, the system detects the similarity and flags the altered submission as suspicious, preventing the fraudulent claim.

In the response below, a similarityScore of 1 means a duplicate was found, and 0.95 means a slightly modified fraudulent attempt was found.

{
  "entities": {
    "similarReceipts": [
      {
        "referenceId": "REF002",
        "userId": "USER-001",
        "trackingId": "T-20241007-6127420",
        "similarityScore": 1
      },
      {
        "referenceId": "REF003",
        "userId": "USER-001",
        "trackingId": "T-20241007-6748290",
        "similarityScore": 0.95
      }
    ]
  }
}
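One way to act on scores like these is to label each match before deciding what to do with the claim. The sketch below is an assumption-laden illustration: the label strings and the `label_match` helper are hypothetical, and the 0.9 cut-off is the general threshold mentioned earlier, not a merchant-adjusted one.

```python
def label_match(score: float, threshold: float = 0.9) -> str:
    """Classify a similarity score into a coarse review label."""
    if score >= 1:
        # A score of 1 corresponds to an exact duplicate in the example.
        return "duplicate"
    if score > threshold:
        # High but not perfect similarity, e.g. an altered date or merchant ID.
        return "possibly modified"
    return "below threshold"


# Scores taken from the example response above.
labels = [label_match(score) for score in (1, 0.95)]
```

Keeping the labels coarse leaves the final decision to your own review process rather than rejecting claims automatically.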


Common Scenarios

Here are some real-world examples of how Taggun's similarity detection system identifies different scenarios, ranging from harmless duplicates to sophisticated fraud attempts.

Note that Taggun is equipped to handle many more cases beyond the examples provided here.


1. Two different images of the same receipt

This scenario occurs when two users submit different photos of the same physical receipt. It could be an innocent mistake (e.g., family members both submitting a shared receipt) or a deliberate attempt to claim multiple rewards.

Figure: Two different images of the same receipt


Response

{
  "trackingId": "T-001",
  "entities": {
    "similarReceipts": [
      {
        "referenceId": "R-001",
        "userId": "U-001",
        "trackingId": "T-002",
        "similarityScore": 1
      }
    ]
  }
}


2. Attempted bypass with slight modification

In this case, a user tries to modify the receipt slightly (e.g., changing the transaction date) to bypass detection. The system still detects the high similarity.

Figure: Attempted bypass with slight modification (date)


Response

{
  "trackingId": "T-003",
  "entities": {
    "similarReceipts": [
      {
        "referenceId": "R-002",
        "userId": "U-002",
        "trackingId": "T-004",
        "similarityScore": 0.98
      }
    ]
  }
}

3. Multiple similar submissions by the same user

This scenario shows a single user submitting multiple similar receipts. It could be accidental (e.g., forgetting about a previous submission) or an attempt to claim multiple rewards.

Figure: Three duplicate submissions by the same user

Response

{
  "trackingId": "T-005",
  "entities": {
    "similarReceipts": [
      {
        "referenceId": "R-003",
        "userId": "U-003",
        "trackingId": "T-006",
        "similarityScore": 1
      },
      {
        "referenceId": "R-004",
        "userId": "U-003",
        "trackingId": "T-007",
        "similarityScore": 1
      }
    ]
  }
}

4. Multiple similar submissions by different users

This indicates a potential syndicated fraud attempt, where multiple users coordinate to submit similar receipts. This could be an organised effort to exploit a promotion or rewards program.

Figure: Multiple similar submissions by different users

Response

{
  "trackingId": "T-008",
  "entities": {
    "similarReceipts": [
      {
        "referenceId": "R-005",
        "userId": "U-004",
        "trackingId": "T-009",
        "similarityScore": 0.97
      },
      {
        "referenceId": "R-006",
        "userId": "U-005",
        "trackingId": "T-010",
        "similarityScore": 0.96
      },
      {
        "referenceId": "R-007",
        "userId": "U-006",
        "trackingId": "T-011",
        "similarityScore": 0.95
      }
    ]
  }
}
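To tell this cross-user pattern apart from the same-user case in scenario 3, you could compare the userId you sent with the request against the userId values returned in similarReceipts. The sketch below assumes you keep the submitter's userId on your side; the `other_users_involved` helper is a hypothetical name.

```python
from typing import Any, Dict, Set


def other_users_involved(response: Dict[str, Any], submitter_user_id: str) -> Set[str]:
    """Return the userIds of similar receipts that belong to other users."""
    entries = (response.get("entities") or {}).get("similarReceipts") or []
    return {
        entry["userId"]
        for entry in entries
        # userId is only present if it was supplied with the earlier submission.
        if entry.get("userId") and entry["userId"] != submitter_user_id
    }
```

An empty result means every match came from the submitter (or carried no userId); several distinct userIds suggest the coordinated pattern described in this scenario and may warrant a closer review.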

5. Incidental Similarities (e.g., daily coffee purchases)

This scenario represents similar but legitimate receipts, such as repeat purchases from a favourite coffee shop or fast-food outlet. They appear in similarReceipts because of their similarity, but with a score below 0.9 (0.82 in this example) they are likely genuine transactions.

Figure: Incidental, innocent similarities

Response

{
  "trackingId": "T-012",
  "entities": {
    "similarReceipts": [
      {
        "referenceId": "R-008",
        "userId": "U-007",
        "trackingId": "T-013",
        "similarityScore": 0.82
      }
    ]
  }
}

Use Cases

  • Promotions: Ensure each participant submits unique, valid receipts.
  • Expense Management: Prevent duplicate reimbursement claims.

FAQs

Q: Will Taggun flag two different people submitting the same receipt?

A: Yes. Including the userId in the request facilitates investigation of potential fraudulent behavior across users.

Q: What if receipts from a specific merchant are always very similar?

A: Taggun's engine dynamically adjusts similarity thresholds based on merchant-specific patterns, reducing false alarms for legitimate submissions.


Best Practices

  • Always include a referenceId to avoid flagging legitimate resubmissions.
  • Use the userId field to track potential collaborative fraud attempts.
  • Implement a review process for receipts flagged as suspicious.

Getting Started

Contact us at [email protected] to enable Similarity Check on your account.