HAT Public Guide

HAT (Handwriting Analysis Tool) is an explainable, parameter-based handwriting comparison platform designed for screening and research workflows. It helps users interpret and report computational indicators in a structured way.

Built with the StartupZila ecosystem for practical product execution and responsible usability.

Disclaimer: HAT outputs are support indicators and not a standalone forensic/legal identification decision. Any serious conclusion requires qualified expert review, complete evidentiary protocol, and explicit limitations.

1. Integrated Handwriting Handbook

Comprehensive handbook integrated from the master HAT handwriting documentation, covering interpretation, weights, reliability, and usage guidance.

1. OVERVIEW

HAT is an AI-assisted handwriting feature extraction and comparative analysis platform used for:

- Handwriting comparison

- Structural handwriting analysis

- Feature extraction

- Academic interpretation

- AI-assisted forensic support

Processing stages generally include:

1. Image preprocessing

2. Noise reduction

3. Character segmentation

4. Stroke extraction

5. Feature normalization

6. Comparative scoring

7. Weighted matching

2. VALUE TYPES

Different parameters may represent:

- Normalized scores

- Weighted scores

- Geometric measurements

- Confidence scores

- Categorical encodings

- Binary presence detection

3. ADVANCED WEIGHT SYSTEM

Weight meaning:

- 0.5 = Low importance

- 0.8 = Reduced influence

- 1.0 = Default influence

- 1.1–1.3 = Higher importance

- 1.5 to below 2.0 = Highly influential

- 2.0 = Extremely influential (upper limit of the weight range)

4. PARAMETER DOCUMENTATION (Guide Intro)

The parameter guide explains what each enabled trait represents, its indicative categories, the detection basis, and the recommended weight.

Use the most reliable traits first, then enable individual cues only when they are clearly visible in both samples.

4.1 WRITING STYLE

Purpose: Detects handwriting flow type.

Possible categories: Print, Semi-Cursive, Cursive, Hybrid.

Detection basis: Character connectivity, stroke continuity, joining frequency.

Recommended weight: 1.0

4.2 CONNECTING STROKES

Measures letter connectivity frequency.

Typical interpretation:

0.0-0.2 = Mostly disconnected

0.2-0.5 = Limited connection

0.5-0.8 = Semi-connected

0.8-1.0 = Fully cursive

Recommended weight: 1.0
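
The interpretation bands above can be sketched as a small lookup. This is an illustration only: the function name `connectivity_label` is hypothetical, and because the listed ranges share their boundary values (0.2, 0.5, 0.8), the sketch assumes a boundary score belongs to the upper band.

```python
def connectivity_label(score: float) -> str:
    """Map a connecting-strokes score in [0, 1] to the guide's interpretation band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    # Assumption: a score on a shared boundary (0.2, 0.5, 0.8) goes to the upper band.
    if score < 0.2:
        return "Mostly disconnected"
    if score < 0.5:
        return "Limited connection"
    if score < 0.8:
        return "Semi-connected"
    return "Fully cursive"


print(connectivity_label(0.65))  # Semi-connected
```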

4.3 LINE QUALITY

Measures smoothness and consistency.

Factors: Stroke smoothness, edge continuity, ink consistency.

Recommended weight: 1.0

4.4 HEIGHT UPPERCASE

Measures uppercase letter height.

Categories: Small, Medium, Large.

Recommended weight: 1.0

4.5 HEIGHT LOWERCASE SHORT

Measures middle-zone lowercase height.

Examples: a, e, o, n.

Recommended weight: 1.0

4.6 HEIGHT LOWERCASE TALL

Measures the height of lowercase letters with upper-zone or lower-zone extensions (ascenders and descenders).

Examples: l, h, t, g, y.

Recommended weight: 1.0

4.7 HEIGHT OVERALL

Measures total writing scale.

Recommended weight: 1.0

4.8 SPACING LETTERS

Measures distance between letters.

Categories: Tight, Medium, Wide.

Recommended weight: 1.0

4.9 SPACING WORDS

Measures average inter-word spacing.

Recommended weight: 1.0

4.10 SPACING LINES

Measures vertical distance between lines.

Categories: Dense, Balanced, Wide.

Recommended weight: 1.0

4.11 BASELINE ALIGNMENT

Detects writing alignment consistency.

Categories: Ascending, Descending, Straight, Wavy.

Detection basis: Baseline extraction; regression slope analysis.

Recommended weight: 1.2

Reliability: Very High
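
As a rough illustration of regression slope analysis, the sketch below fits a least-squares line through extracted baseline points and maps the result to the four categories. It is not HAT's actual detector: the function name, the `slope_tol` and `wave_tol` thresholds, the choice to check waviness first, and the convention that y increases upward are all assumptions made for the example.

```python
from math import sqrt


def baseline_category(points, slope_tol=0.02, wave_tol=2.0):
    """Classify a baseline from (x, y) points, with y increasing upward.

    slope_tol and wave_tol are hypothetical thresholds, not HAT's values.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two baseline points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("baseline points must span more than one x position")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx                      # least-squares regression slope
    intercept = mean_y - slope * mean_x
    # Root-mean-square distance of the points from the fitted line.
    rms = sqrt(sum((y - (slope * x + intercept)) ** 2 for x, y in points) / n)
    if rms > wave_tol:
        return "Wavy"
    if slope > slope_tol:
        return "Ascending"
    if slope < -slope_tol:
        return "Descending"
    return "Straight"


print(baseline_category([(0, 0), (100, 10), (200, 20)]))  # Ascending
```

If the points come from image coordinates, where y usually grows downward, the sign of the slope would need to be flipped before applying the labels.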

4.12 SLANT

Measures the angle of the writing relative to vertical.

Approximate categories:

-45 to -10 deg = Left slant

-10 to 10 deg = Vertical

10 to 45 deg = Right slant

Recommended weight: 1.2

Reliability: Very High
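
The approximate slant categories can be expressed as a simple mapping. In this sketch the function name is hypothetical, angles exactly at -10 or 10 degrees are assumed to count as Vertical, and angles beyond the listed -45 to 45 degree span are reported separately because the guide does not assign them a category.

```python
def slant_category(angle_deg: float) -> str:
    """Map a slant angle in degrees to the guide's approximate categories."""
    if angle_deg < -45 or angle_deg > 45:
        # The guide lists no category outside -45..45 degrees.
        return "Outside listed range"
    if angle_deg < -10:
        return "Left slant"
    if angle_deg <= 10:   # assumption: both boundaries count as Vertical
        return "Vertical"
    return "Right slant"


print(slant_category(22.5))  # Right slant
```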

4.13 PEN PRESSURE

Estimates writing pressure intensity.

Detection basis: Stroke darkness, pixel density, edge thickness.

Categories: Light, Medium, Heavy.

Recommended weight: 1.0

Reliability: Medium

4.14 TREMORS

Measures shakiness and instability.

Detection basis: Stroke jitter, edge oscillation.

Recommended weight: 1.0

4.15 LEFT MARGIN / RIGHT MARGIN

Measures page alignment consistency.

Categories: Consistent, Irregular, Expanding, Narrowing.

Recommended weight: 0.8

4.16 EMBELLISHMENT UPPERCASE / LOWERCASE

Detects decorative handwriting formations.

Examples: Decorative loops, extended strokes, stylized capitals.

Recommended weight: 1.2

4.17 LOOP UPPER ZONE

Measures upper-zone loop structure in letters such as l, h, and b.

Features: Loop width, loop closure, loop angle.

Recommended weight: 1.1

4.18 LOOP LOWER ZONE

Measures lower-zone loops in: g, y, j.

Recommended weight: 1.1

4.19 DOUBLE LOOP

Detects dual-loop formations.

Recommended weight: 1.2

4.20 DIACRITIC I

Analyzes dot placement behavior.

Features: Dot position, dot height, dot spacing.

Recommended weight: 1.3

Distinctiveness: Very High

4.21 DIACRITIC T

Analyzes t-bar characteristics.

Features: Cross height, cross angle, cross length.

Recommended weight: 1.3

Distinctiveness: Very High

4.22 EYELET

Detects enclosed eye-shaped formations.

Recommended weight: 1.1

4.23 UNUSUAL FORMATION

Detects rare handwriting constructions.

Examples: Unique letter structures, rare stylization, non-standard strokes.

Recommended weight: 1.3

Distinctiveness: Extremely High

5. MOST RELIABLE PARAMETERS

Very High Reliability: Slant; Baseline Alignment; Diacritic T; Diacritic I; Writing Style.

Medium Reliability: Spacing; Height; Loops; Embellishments.

Lower Reliability: Pen Pressure; Tremors; Line Quality.

6. RECOMMENDED MATCHING CONFIGURATION

Suggested weights:

Slant = 1.2

Baseline Alignment = 1.2

Diacritic I = 1.3

Diacritic T = 1.3

Writing Style = 1.0

Connecting Strokes = 1.0

Loop Features = 1.1

Unusual Formation = 1.3

Margins = 0.8

7. CONDITIONS AFFECTING ACCURACY

Accuracy may be reduced by:

- Low DPI scans

- Blur

- Shadows

- Compression artifacts

- Poor lighting

- Scanner quality

- Different pens/paper

8. RECOMMENDED SCANNING SETTINGS

Resolution: 300 DPI or higher

Format: PNG preferred

Background: Plain white

Orientation: Straight

Compression: Minimal

9. SIMILARITY SCORE INTERPRETATION

90-100% = Very High Similarity

75-89% = Strong Similarity

60-74% = Moderate Similarity

40-59% = Weak Similarity

Below 40% = Low Similarity

Note: These are practical reporting bands. Always report the confidence band and quality context from the HAT outputs alongside the similarity score.
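
For reporting scripts, the bands above can be applied with a helper like the one below. The function name is hypothetical, and fractional scores such as 89.5 are assumed to fall into the band whose lower bound they reach, since the guide lists whole-number ranges only.

```python
def similarity_band(score: float) -> str:
    """Map a 0-100 similarity score to the guide's reporting band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Assumption: each band starts at its listed lower bound, so 89.5 is "Strong".
    if score >= 90:
        return "Very High Similarity"
    if score >= 75:
        return "Strong Similarity"
    if score >= 60:
        return "Moderate Similarity"
    if score >= 40:
        return "Weak Similarity"
    return "Low Similarity"


print(similarity_band(72))  # Moderate Similarity
```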

10. LIMITATIONS

The HAT system is AI-assisted and should not be treated as a standalone forensic authority.

Possible limitations:

- Image quality dependency

- Intentional disguise

- Writing mood variability

- Small sample size

- Scanner artifacts

11. ACADEMIC USAGE NOTE

Recommended terminology:

"The HAT system performs multi-parameter handwriting feature extraction and weighted comparative analysis using geometric, structural, and stylistic handwriting characteristics derived from digitized handwritten samples."

2. Handwriting Theory and Interpretation Guide

Theory-first, English-only guide explaining how HAT thinks about handwriting, feature families, reliability, and responsible interpretation.

2.1 Conceptual Overview

HAT supports handwriting comparison through a pipeline of image preparation, feature extraction, weighted comparison, and confidence-oriented interpretation.

Instead of jumping into formulas, this section focuses on how the system behaves conceptually: what it looks for in handwriting and how it summarises similarity into a single screening indicator.

2.2 Parameter Families

Class-style traits describe broad writing behavior shared by many writers (for example general slant, spacing rhythm, baseline behaviour, margins, and overall size).

Individual-style traits capture more personal tendencies such as how i-dots and t-bars are placed, how loops and embellishments are shaped, and whether unusual formations regularly appear.

For stable interpretation you typically want class traits to give you “overall agreement” and individual traits to add extra strength when they consistently match.

2.3 Accuracy and Responsibility

Result reliability depends heavily on input quality (blur, lighting, crop style), consistency between the two images, and honest reporting of which parameters and weights were used.

HAT is designed as a screening / research support tool; any strong statement about authorship should always include expert review, additional evidence, and a clear limitation note.

3. Parameter Selection Accuracy Playbook

Complete practical guide, in English, for which parameters to enable, how to set weights, and how to keep comparison results as reliable as possible.

3.1 What to Use

For general screening, rely on broad and relatively stable traits such as slant, baseline alignment, word and line spacing, overall size, line quality and approximate pressure, plus left/right margins.

For writer-specific analysis, layer in diacritics (i-dots and t-bars), loops, embellishments, eyelets and unusual formations, but only when they are clearly visible in both images and not dominated by noise.

3.2 How to Tune Weights

Use 0.5 to 2.0 as influence control: 0.5 means “soft influence”, 1.0 means “normal influence”, 1.5 means “strong influence”, and 2.0 means “very strong influence” for that parameter in the final distance.

Change only a few parameters per run, keep most weights between 0.8 and 1.4, and avoid pushing many traits above 1.7 at the same time unless you have a very clear reason.
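
The tuning advice above can be turned into a quick sanity check before a run. This is a hedged sketch, not part of HAT: `check_weights` is a hypothetical helper, "most weights" is interpreted as more than half, and "many traits above 1.7" is interpreted as more than `max_strong` (default 2).

```python
from typing import Dict, List


def check_weights(weights: Dict[str, float], max_strong: int = 2) -> List[str]:
    """Return warnings for a weight configuration; an empty list means no concerns."""
    warnings = []
    for name, w in weights.items():
        if not 0.5 <= w <= 2.0:
            warnings.append(f"{name}: weight {w} is outside the 0.5 to 2.0 range")
    # Assumption: "most" means more than half of the configured weights.
    in_band = sum(1 for w in weights.values() if 0.8 <= w <= 1.4)
    if weights and in_band * 2 <= len(weights):
        warnings.append("half or fewer of the weights fall between 0.8 and 1.4")
    # Assumption: "many" means more than max_strong traits above 1.7.
    strong = [name for name, w in weights.items() if w > 1.7]
    if len(strong) > max_strong:
        warnings.append("too many traits above 1.7: " + ", ".join(strong))
    return warnings


# The suggested weights from the handbook's recommended matching configuration.
recommended = {
    "Slant": 1.2, "Baseline Alignment": 1.2, "Diacritic I": 1.3,
    "Diacritic T": 1.3, "Writing Style": 1.0, "Connecting Strokes": 1.0,
    "Loop Features": 1.1, "Unusual Formation": 1.3, "Margins": 0.8,
}
print(check_weights(recommended))  # []
```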

3.3 Accuracy Workflow

Start with default parameters and weights, run a comparison, and note quality scores and which traits show large mismatches.

Then choose one profile (balanced / individual-emphasis / noisy-conservative), adjust a few weights, rerun, and keep the configuration that gives the most stable and explainable result.

For reporting, always list which parameters were enabled, which weights were customised, what confidence band you obtained, and include a short limitation statement.

4. Advanced Filters and Scoring Guide

Deep-dive explanation of how advanced filters work in HAT: parameter selection, 0.5–2.0 weights, 0–100 scores, and how the system decides low / medium / high confidence.

4.1 Advanced Filters

Advanced filters combine two dials you can control: which handwriting parameters are included in the calculation, and how much influence each one has via its weight.

Choosing the right subset lets you focus the comparison on traits that are visible, relevant and trustworthy for your case instead of blindly using everything.

4.2 Weight Range (0.5 to 2.0)

0.5 tells the system “treat this trait gently; even if it mismatches, do not overreact”, while 1.0 keeps the default influence and 2.0 says “if this trait disagrees strongly, let it strongly pull the score down”.

Weights never change the raw measurements themselves; they only change how much that trait contributes when all trait mismatches are blended into one overall distance.

4.3 0 to 100 Scores and Categories

Many values are mapped into a 0–100 range simply so that you can visually compare traits on the same scale; internally the tool still remembers raw geometry and proxy values.

The final similarity is also expressed on a 0–100 style scale, but what matters more is the confidence band: high (>=68), medium (>=42 and <68), or low (<42), which should always be read together with scan quality and parameter choices.

5. Project Introduction and Dissertation Note

High-level English narrative describing why HAT was created, what market gap it addresses, the method it follows, and how it fits into responsible forensic support.

5.1 Purpose

HAT is built as a deterministic and explainable handwriting comparison platform for screening and research workflows.

It addresses a market gap between slow manual analysis and black-box systems that are hard to audit.

5.2 Why It Exists

To provide transparent parameter-level comparison, reproducible scoring, and practical operational usability for early-stage teams and institutions.

The system is intended to assist analysis, not replace final forensic judgment.

5.3 Forensic Position

HAT is a support tool for triage, consistency, and research reporting.

Final legal or forensic conclusions should remain with qualified experts and full evidentiary protocol.

6. Technical Interpretation Reference

Official interpretation rules for the main response fields, 0–100 mappings, confidence bands, and roles/responsibilities when using HAT in serious contexts.

6.1 Output Meaning

`metrics.value` is the raw or proxy value extracted for a parameter.

`metrics.score` is the normalized display score (0-100).

`deltas` are pairwise dissimilarities (0-100); lower means closer.

6.2 Confidence and Fusion

The final `same_writer_probability` is a harmonic fusion of two independent signals: the metric probability (CV metrics) and the structural similarity (SSIM/edges/histogram).

The fused score is used to decide a confidence band:

- high: >= 68

- medium: >= 42 and < 68

- low: < 42

Per-parameter values shown in the report are descriptive; the final fused confidence depends on how raw feature mismatches combine across parameters plus the structural branch.
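
The band thresholds listed above translate directly into code. The function name `confidence_band` is hypothetical; the thresholds are the ones stated in this section.

```python
def confidence_band(fused_score: float) -> str:
    """Assign the confidence band from the fused 0-100 score."""
    if fused_score >= 68:
        return "high"
    if fused_score >= 42:
        return "medium"
    return "low"


print(confidence_band(67.9))  # medium
```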

6.3 Compliance Guardrail

Public reporting must not claim legal certainty from score alone.

Always include method, limitations, selected parameters/weights, and quality context.

7. Scoring Model

Raw-metric dissimilarity + structural branch + harmonic fusion.

7.1 Metric Branch

Step 1: extract raw/proxy metrics for each selected parameter from both images (A and B).

Step 2: compute per-parameter pairwise dissimilarity (`deltas`) on a 0–100 scale.

Step 3: compute `weighted_distance` as a weighted mean of these `deltas` (lower means more similar).

Step 4: convert `weighted_distance` into `metric_probability` (0–100) using an exponential decay mapping.

Intuition: if mismatches are small across the selected traits, the exponential mapping yields a higher metric probability.
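
Steps 3 and 4 can be sketched as follows. The weighted mean follows the description above; the exponential mapping is only one plausible form, and the `decay` constant of 0.03 is a hypothetical value chosen for illustration, not HAT's calibrated parameter. The sample deltas are likewise invented for the example.

```python
from math import exp


def weighted_distance(deltas, weights):
    """Weighted mean of per-parameter deltas (0-100); lower means more similar."""
    total_weight = sum(weights[name] for name in deltas)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(deltas[name] * weights[name] for name in deltas) / total_weight


def metric_probability(distance, decay=0.03):
    """Map a weighted distance to 0-100 with exponential decay.

    decay is a hypothetical constant; the real mapping may differ.
    """
    return 100.0 * exp(-decay * distance)


# Illustrative deltas only; they are not real HAT output.
deltas = {"slant": 10.0, "baseline_alignment": 20.0, "margins": 60.0}
weights = {"slant": 1.2, "baseline_alignment": 1.2, "margins": 0.8}

distance = weighted_distance(deltas, weights)
print(round(distance, 2), round(metric_probability(distance), 1))
```

Because Margins carries a weight of 0.8, its large mismatch pulls the distance up less than it would at the default weight, while the raw deltas themselves stay unchanged.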

7.2 Structural Branch

Compute an independent similarity from image structure using a letterboxed comparison, in which both images are fitted to a common canvas with padding rather than stretched.

Components (combined into 0–100 structural similarity):

- grayscale SSIM (48%)

- edge SSIM on dilated Canny edges (32%)

- gradient-magnitude histogram correlation (20%, correlation mapped from [-1,1] to [0,1]).

Intuition: this branch rewards overall visual similarity even if some scalar trait estimates vary.
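
The component weights above can be combined as in the sketch below. It takes the three component values as inputs instead of computing them from images. The function name, the linear (c + 1) / 2 mapping for the correlation, and the clamping of every component to [0, 1] are assumptions for the example.

```python
def structural_similarity(gray_ssim, edge_ssim, hist_corr):
    """Blend three component scores into a 0-100 structural similarity."""

    def clamp01(value):
        # Assumption: components are clamped to [0, 1] before blending.
        return max(0.0, min(1.0, value))

    # Correlation lies in [-1, 1]; map it linearly to [0, 1].
    hist_score = (hist_corr + 1.0) / 2.0
    blended = (
        0.48 * clamp01(gray_ssim)     # grayscale SSIM (48%)
        + 0.32 * clamp01(edge_ssim)   # edge SSIM on dilated Canny edges (32%)
        + 0.20 * clamp01(hist_score)  # gradient-magnitude histogram correlation (20%)
    )
    return 100.0 * blended


print(round(structural_similarity(0.5, 0.5, 0.0), 1))  # 50.0
```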

7.3 Final Fusion (Harmonic)

Compute fused `same_writer_probability` using harmonic fusion of `metric_probability` and `structural_similarity`.

Harmonic fusion penalizes cases where only one branch is high and the other is low, reducing false “high match” claims.

The confidence band labels (high/medium/low) are then assigned from the fused score thresholds.
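
A minimal sketch of the fusion step, assuming "harmonic fusion" means the plain harmonic mean of the two 0–100 branch scores; the actual implementation may weight or adjust the branches differently. The function name is hypothetical.

```python
def harmonic_fusion(metric_probability, structural_similarity):
    """Harmonic mean of two 0-100 scores; returns 0.0 if either branch is 0 or less."""
    a, b = float(metric_probability), float(structural_similarity)
    if a <= 0 or b <= 0:
        return 0.0
    return 2.0 * a * b / (a + b)


print(harmonic_fusion(80, 80))  # 80.0
print(harmonic_fusion(90, 30))  # 45.0
```

In the second call the arithmetic mean would be 60, but the harmonic mean is 45, which shows how one weak branch holds the fused score down.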

8. Calibration and Accuracy Protocol

Dataset standards, tuning loop, quality gates, and reporting requirements.

8.1 Dataset

Use balanced same-writer and different-writer pairs with realistic capture variability.

Maintain a locked validation split for unbiased evaluation.

8.2 Reporting

Track confusion matrix, precision/recall/F1, and error examples.
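
The tracked metrics can be computed from labelled pairs as in this sketch, where 1 means "same writer" and 0 means "different writer". The function name and the sample labels are illustrative only and do not represent HAT results.

```python
def pair_metrics(y_true, y_pred):
    """Confusion-matrix counts plus precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1}


# Illustrative labels: three same-writer pairs and three different-writer pairs.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
print(pair_metrics(y_true, y_pred))
```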

Record threshold and weight updates for release traceability.

9. Parameters Guide

Human guide for class and individual characteristics in HAT.

9.1 Class Characteristics

Includes broad traits such as slant, spacing, baseline alignment, margins, and pressure proxies.

Useful for broad screening and pattern grouping.

9.2 Individual Characteristics

Includes diacritics, loops, embellishments, eyelets, and unusual formations.

Useful as stronger person-specific cues, with due caution about the limitations of proxy measurements.