Reference¶

to be completed

For more information, see the glossary.

Analytical attacks ¶

chi2 attack ¶

sealwatch.chi2.attack(spatial)¶

Measures the “distance” between the observed histogram and a typical histogram after LSB replacement.

In expectation, LSB replacement at embedding rate 1 equalizes the two histogram bins of each pair of values (2k, 2k+1), so both bins approach their average.

Parameters:: spatial (np.ndarray) – image pixels, of arbitrary shape
Returns:: distance and p-value. The distance is the chi2 test statistic between the observed histogram and the stego model. A small distance means that the image matches the model (e.g., because it was embedded with LSB replacement). The p-value turns the distance into a probability. A p-value near 0 suggests that the image contains no LSBR steganography; a p-value near 1 indicates that the image contains LSBR steganography.
Return type:: Tuple[float]
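The distance described above can be illustrated with a minimal NumPy sketch. This is not the sealwatch implementation: the helper name pair_chi2_statistic, the 8-bit value range, and the toy cover (even values only) are assumptions made for illustration, and the conversion to a p-value is left out.

```python
import numpy as np

def pair_chi2_statistic(pixels: np.ndarray) -> float:
    """Chi-square distance between the histogram and its pair-averaged version."""
    hist = np.bincount(pixels.ravel(), minlength=256).astype(float)
    even, odd = hist[0::2], hist[1::2]
    expected = (even + odd) / 2   # stego model: both bins of a pair are equal
    used = expected > 0           # skip empty pairs to avoid division by zero
    return float(np.sum((even[used] - expected[used]) ** 2 / expected[used]))

rng = np.random.default_rng(0)
cover = 2 * rng.integers(0, 128, size=(64, 64))   # toy cover: even values only
bits = rng.integers(0, 2, size=cover.shape)       # random message bits
stego = (cover & ~1) | bits                       # LSB replacement at rate 1

chi2_cover = pair_chi2_statistic(cover)
chi2_stego = pair_chi2_statistic(stego)
print(chi2_cover, chi2_stego)
```

The toy cover is deliberately far from the stego model, so its distance is large, while the fully embedded image yields a much smaller distance.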

SPA ¶

sealwatch.spa.attack(x0)¶

Run sample-pair analysis.

Parameters:

x0 (np.ndarray) – image pixels in the spatial domain

Returns:

embedding rate estimate

Return type:

float

Example:

>>> import numpy as np
>>> from PIL import Image
>>> import sealwatch as sw
>>> spatial = np.array(Image.open('suspicious.png'))
>>> alpha_hat = sw.spa.attack(spatial)
>>> assert alpha_hat == 0

WS ¶

sealwatch.ws.attack(x1, pixel_predictor='KB', correct_bias=False, weighted=True)¶

Runs weighted stego-image (WS) steganalysis on a given image.

The goal of WS steganalysis is to estimate the embedding rate of uniform LSB replacement embedding.

Parameters:

x1 (np.ndarray) –
pixel_predictor –
correct_bias (bool) –
weighted (bool) –

Returns:

change rate estimate

Return type:

float
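The idea behind WS can be sketched in a few lines of NumPy. This is a simplified, unweighted variant under stated assumptions: the helper ws_estimate is hypothetical, the predictor is a plain average of the four neighbors (not the 'KB' predictor), and no bias correction is applied. It estimates the embedding rate as twice the mean of (s - flipped(s)) * (s - predicted(s)).

```python
import numpy as np

def ws_estimate(stego: np.ndarray) -> float:
    """Unweighted WS estimate of the embedding rate, using interior pixels only."""
    s = stego.astype(float)
    center = s[1:-1, 1:-1]
    # Predict each pixel from its four neighbors (assumed predictor, not 'KB').
    predicted = (s[:-2, 1:-1] + s[2:, 1:-1] + s[1:-1, :-2] + s[1:-1, 2:]) / 4
    # Pixel value with flipped LSB: even values go up by 1, odd values go down by 1.
    flipped = center + 1 - 2 * (center % 2)
    return float(2 * np.mean((center - flipped) * (center - predicted)))

# Synthetic smooth image with mild noise (toy data, not a real photograph).
rng = np.random.default_rng(0)
i, j = np.mgrid[0:128, 0:128]
smooth = 120 + 40 * np.sin(i / 15) + 40 * np.cos(j / 20)
cover = np.clip(np.round(smooth + rng.normal(0, 2, smooth.shape)), 0, 255).astype(np.int64)
stego = (cover & ~1) | rng.integers(0, 2, size=cover.shape)  # LSB replacement, rate 1

rate_cover = ws_estimate(cover)
rate_stego = ws_estimate(stego)
print(rate_cover, rate_stego)
```

On this toy data the estimate should be close to 0 for the cover and close to 1 for the fully embedded image; real images need a better predictor and the weighting that gives the method its name.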

sealwatch.ws.unet_estimator(*args, **kw)¶

Histogram attack ¶

sealwatch.F5.attack(y1, qt, **kw)¶

Runs a histogram attack with Cartesian calibration, targeted against F5.

Pools the estimates for the DCT AC modes 01, 10, and 11.

Parameters:

y1 (np.ndarray) – Stego DCT coefficients.
qt (np.ndarray) – quantization table

Returns:

change rate estimate

Return type:

float

Example:

>>> jpeg1 = jpeglib.read_dct('suspicious.jpeg')
>>> beta_hat = sw.F5.attack(jpeg1.Y, jpeg1.qt[0])

RJCA ¶

sealwatch.rjca.attack(y1, qt)¶

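Performs the reverse JPEG compatibility attack (RJCA) and returns the variance of the rounding error.

For cover images, the variance of the rounding error is typically around 0.04 to 0.07. For stego images, it grows towards 1/12 ≈ 0.08333, the variance of a uniformly distributed rounding error.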

Parameters:

y1 (np.ndarray) – quantized DCT coefficients of the analyzed image, of shape [num_vertical_blocks, num_horizontal_blocks, 8, 8]
qt (np.ndarray) – quantization table of shape [8, 8]

Returns:

variance of the rounding error

Return type:

float

Example:

>>> jpeg = jpeglib.read_dct('suspicious.jpeg')
>>> var = sw.rjca.attack(
...         y1=jpeg.Y,
...         qt=jpeg.qt[0],
... )
>>> assert np.abs(var - 1/12.) > .005
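The quantity returned above can be approximated with a minimal NumPy sketch: dequantize the blocks, apply the inverse 8x8 DCT, and measure how far the resulting pixel values are from integers. The helper rounding_error_variance is hypothetical and ignores details such as the level shift and clipping, so it is an illustration rather than the library's implementation.

```python
import numpy as np

def rounding_error_variance(y: np.ndarray, qt: np.ndarray) -> float:
    """Variance of the spatial-domain rounding error after decompressing DCT blocks."""
    k = np.arange(8).reshape(8, 1)
    n = np.arange(8).reshape(1, 8)
    # Orthonormal 8x8 DCT-II matrix; rows are frequencies, columns are samples.
    dct = np.sqrt(2 / 8) * np.cos((2 * n + 1) * k * np.pi / 16)
    dct[0, :] = np.sqrt(1 / 8)
    # Dequantize and apply the inverse 2D DCT to all blocks at once.
    spatial = dct.T @ (y * qt) @ dct
    error = spatial - np.round(spatial)
    return float(np.var(error))

rng = np.random.default_rng(0)
qt = np.ones((8, 8))                                # toy quantization table
y_zero = np.zeros((4, 4, 8, 8))                     # decompresses to integers
y_rand = rng.integers(-20, 21, size=(16, 16, 8, 8)) # toy random coefficients

var_zero = rounding_error_variance(y_zero, qt)
var_rand = rounding_error_variance(y_rand, qt)
print(var_zero, var_rand, 1 / 12)
```

Random coefficients decompress to values whose fractional parts are spread almost uniformly, which is why the variance lands near the 1/12 limit mentioned above.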

Handcrafted features ¶

HCF-COM ¶

sealwatch.hcfcom.extract(x1, *, order=1)¶

Parameters:

x1 (np.ndarray) –
order –

Returns:

Return type:

OrderedDict

Example:

>>> # TODO

sealwatch.hcfcom.extract_from_file(path, **kw)¶

Parameters:: path (Union[str, Path]) –
Return type:: Dict[str, ndarray]

SPAM ¶

sealwatch.spam.extract(x1, *, T=3, rounded=False)¶

Extract 2nd-order spatial adjacency model (SPAM) features. The implementation merges over image directions.

The final feature set has 686 dimensions.

Parameters:

x1 (np.ndarray) – 2D ndarray
T (int) – truncation threshold
rounded (bool) –

Returns:

ordered dict containing 686 feature dimensions in total.

Return type:

collections.OrderedDict

Examples:

>>> features = sw.spam.extract(x1)

By default, this function uses the Rust-accelerated backend. To use the (substantially slower) Python implementation, run

>>> with sw.BACKEND_PYTHON:
...     features = sw.spam.extract(x1)
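The dimensionality follows from the truncation threshold: each second-order model holds (2T + 1)^3 = 343 values for T = 3, and two merged direction groups give 2 x 343 = 686. The sketch below computes one such model for a single direction only. The helper second_order_transitions and its index order are assumptions for illustration; the library additionally merges several directions.

```python
import numpy as np

def second_order_transitions(x: np.ndarray, T: int = 3) -> np.ndarray:
    """Transition probabilities of truncated horizontal differences (one direction)."""
    # Differences between horizontal neighbors, truncated to [-T, T], shifted to 0..2T.
    d = np.clip(x[:, :-1].astype(int) - x[:, 1:].astype(int), -T, T) + T
    a, b, c = d[:, :-2].ravel(), d[:, 1:-1].ravel(), d[:, 2:].ravel()
    counts = np.zeros((2 * T + 1,) * 3)
    np.add.at(counts, (a, b, c), 1)              # joint counts of difference triples
    totals = counts.sum(axis=2, keepdims=True)   # occurrences of each (a, b) prefix
    # Probability of the third difference given the first two; unseen prefixes stay 0.
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32))      # toy image
model = second_order_transitions(image, T=3)
row_sums = model.sum(axis=2)
print(model.shape, 2 * model.size)
```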

sealwatch.spam.extract_from_file(path, *, rounded=True, **kw)¶

Extract SPAM features from the luminance channel of the given JPEG image.

Parameters:

path (str or pathlib.Path) – JPEG image to be analyzed
rounded (bool) –

Returns:

ordered dict with the feature values

Return type:

collections.OrderedDict

Example:

>>> # TODO

This function works only with the Python backend.

SRM ¶

sealwatch.srm.extract(x, *, qs=[[1, 2], [1, 1.5, 2], [1, 1.5, 2], [1, 1.5, 2], [1, 1.5, 2]], directional=True)¶

Extracts the spatial rich model (SRM) features for steganalysis.

Parameters:

x (np.ndarray) – 2D input image
qs (List[List[int]]) –
directional (bool) –

Returns:

structured SRM features

Return type:

OrderedDict

sealwatch.srm.extract_from_file(path, **kw)¶

Parameters:: path (Union[str, Path]) –
Return type:: Dict[str, ndarray]

sealwatch.srmq1.extract(x, **kw)¶

Extracts the SRMQ1 variant of the spatial rich model for steganalysis.

Parameters:: x (np.ndarray) – 2D input image
Returns:: structured SRMQ1 features
Return type:: collections.OrderedDict

sealwatch.srmq1.extract_from_file(path, **kw)¶

Parameters:: path (Union[str, Path]) –
Return type:: Dict[str, ndarray]

CRM ¶

sealwatch.crm.extract(x, *, q=1, Tc=2, implementation=Implementation.CRM_FIX_MIN24)¶

Extracts the color rich model (CRM) features for steganalysis.

Parameters:

x (np.ndarray) – 2D input image
q (int) –
Tc (int) –
implementation (Implementation) –

Returns:

structured CRM features

Return type:

collections.OrderedDict

sealwatch.crm.extract_from_file(path, **kw)¶

Parameters:: path (Union[str, Path]) –
Return type:: Dict[str, ndarray]

JRM ¶

sealwatch.jrm.extract(y1, *, calibrated=False, qt=None)¶

Extracts JPEG rich models (JRM) for the given DCT coefficients.

Parameters:

y1 (np.ndarray) – DCT coefficients, of shape [num_vertical_blocks, num_horizontal_blocks, 8, 8]
calibrated (bool) – if True, extract the calibrated cc-JRM features instead of JRM
qt (np.ndarray) – quantization table

Returns:

JRM features as ordered dict, where the keys are the names of the submodels. All submodels together have dimensionality 11255

Return type:

collections.OrderedDict

sealwatch.jrm.extract_from_file(path, calibrated=False)¶

Compute the JPEG rich models (JRM) feature descriptor from the given image’s luminance channel.

The mode-specific submodels give the rich model a fine “granularity” at the price of utilizing only a small portion of the DCT plane. To cover a larger range of DCT coefficients, the mode-specific submodels are complemented by co-occurrence matrices integrated over all DCT modes.

J. Kodovsky, J. Fridrich, Steganalysis of JPEG Images Using Rich Models, Proc. SPIE, Electronic Imaging, Media Watermarking, Security, and Forensics XIV, San Francisco, CA, January 23–25, 2012. http://dde.binghamton.edu/kodovsky/pdf/SPIE2012_Kodovsky_Steganalysis_of_JPEG_Images_Using_Rich_Models_paper.pdf

Parameters:

path – path to JPEG image
calibrated (bool) – if True, extract the calibrated cc-JRM features instead of JRM

Returns:

JRM features as ordered dict, where the keys are the names of the submodels. All submodels together have dimensionality 11255

Return type:

collections.OrderedDict

sealwatch.ccjrm.extract(y1, *, qt=None)¶

Extracts calibrated JPEG rich models (cc-JRM) for the given DCT coefficients.

Parameters:

y1 (np.ndarray) – DCT coefficients, of shape [num_vertical_blocks, num_horizontal_blocks, 8, 8]
qt (np.ndarray) – quantization table

Returns:

cc-JRM features as ordered dict, where the keys are the names of the submodels. All submodels together have dimensionality 22510

Return type:

collections.OrderedDict

sealwatch.ccjrm.extract_from_file(path)¶

Compute the calibrated JPEG rich models (cc-JRM) feature descriptor from the given image’s luminance channel.

The mode-specific submodels give the rich model a fine “granularity” at the price of utilizing only a small portion of the DCT plane. To cover a larger range of DCT coefficients, the mode-specific submodels are complemented by co-occurrence matrices integrated over all DCT modes.

J. Kodovsky, J. Fridrich, Steganalysis of JPEG Images Using Rich Models, Proc. SPIE, Electronic Imaging, Media Watermarking, Security, and Forensics XIV, San Francisco, CA, January 23–25, 2012. http://dde.binghamton.edu/kodovsky/pdf/SPIE2012_Kodovsky_Steganalysis_of_JPEG_Images_Using_Rich_Models_paper.pdf

Parameters:: path – path to JPEG image
Returns:: cc-JRM features as ordered dict, where the keys are the names of the submodels. All submodels together have dimensionality 22510
Return type:: collections.OrderedDict

DCTR ¶

sealwatch.dctr.extract(x1, q, *, T=4)¶

Extracts DCTR features from the provided image.

Note that there can be minor differences during quantization, which is why the Matlab and Python results do not match perfectly.

Parameters:

x1 – grayscale image with intensities in range [-128, 127]
q (float) – quantization step
T – truncation threshold. The number of histogram bins is T + 1.

Returns:

DCTR features of shape [64x25, 5]

Return type:

Example:

>>> # TODO

sealwatch.dctr.extract_from_file(path, qf)¶

Extract DCTR features from the luminance channel of the JPEG image at the given file path.

Parameters:

path (str) – path to JPEG image
qf – JPEG quality factor used to determine the quantization step

Returns:

DCTR features of shape [64x25, 5]

Return type:

PHARM ¶

sealwatch.pharm.extract(x1, *, implementation=Implementation.PHARM_REVISITED, q=5, T=2, num_projections=100, maximum_projection_size=8, first_order_residuals=True, second_order_residuals=True, third_order_residuals=True, symmetrize=True, normalize=False, seed=1)¶

Extracts the PHARM features from a given decompressed image.

The PHARM features were introduced in V. Holub and J. Fridrich, Phase-Aware Projection Model for Steganalysis of JPEG Images. SPIE Electronic Imaging, Media Watermarking, Security, and Forensics XVII, vol. 9409, 2015. http://dde.binghamton.edu/vholub/pdf/SPIE15_Phase-Aware_Projection_Model_for_Steganalysis_of_JPEG_Images.pdf

Parameters:

x1 (np.ndarray) – decompressed JPEG image of shape [height, width]
implementation (Implementation) – implementation of PHARM to use
q (int) – quantization step
T (int) – truncation threshold
num_projections (int) – number of random projection matrices. The original implementation defaults to 900, but we use 100 for speed reasons.
maximum_projection_size (int) – maximum spatial size of each projection matrix
first_order_residuals (bool) – If True, include first order residuals. If False, skip first order residuals.
second_order_residuals (bool) – If True, include second order residuals. If False, skip second order residuals.
third_order_residuals (bool) – If True, include third order residuals. If False, skip third order residuals.
symmetrize (bool) – If True, merge histograms with horizontally and vertically flipped versions of the image. If False, skip symmetrization.
normalize (bool) – If True, normalize the histogram counts.
seed (int) – seed for random number generator for the projection matrices

Returns:

features as ordered dictionary, where the keys are the submodel names and the values are the features of shape [num_projections, T]. Note that the features are not normalized.

Return type:

OrderedDict

Example:

>>> # TODO

sealwatch.pharm.extract_from_file(path, *, implementation=Implementation.PHARM_REVISITED, q=5, T=2, num_projections=100, maximum_projection_size=8, first_order_residuals=True, second_order_residuals=True, third_order_residuals=True, symmetrize=True, normalize=False, seed=1)¶

Extracts the PHARM features from a given JPEG image.

The PHARM features were introduced in V. Holub and J. Fridrich, Phase-Aware Projection Model for Steganalysis of JPEG Images. SPIE Electronic Imaging, Media Watermarking, Security, and Forensics XVII, vol. 9409, 2015. http://dde.binghamton.edu/vholub/pdf/SPIE15_Phase-Aware_Projection_Model_for_Steganalysis_of_JPEG_Images.pdf

Parameters:

path (str or Path) – path to JPEG image
implementation (Implementation) – implementation of PHARM to use
q (int) – quantization step
T (int) – truncation threshold
num_projections (int) – number of random projection matrices. The original implementation defaults to 900, but we use 100 for speed reasons.
maximum_projection_size (int) – maximum spatial size of each projection matrix
first_order_residuals (bool) – whether to include first order residuals
second_order_residuals (bool) – whether to include second order residuals
third_order_residuals (bool) – whether to include third order residuals
symmetrize (bool) – whether to merge histograms with horizontally and vertically flipped image. If False, skip symmetrization.
normalize (bool) – whether to normalize the histogram counts, by default False
seed (int) – seed for random number generator for the projection matrices, by default 1

Returns:

features as ordered dictionary, where the keys are the submodel names and the values are the features of shape [num_projections, T]. Note that the features are not normalized.

Return type:

OrderedDict

Example:

>>> # TODO

class sealwatch.pharm.Implementation(value)¶

PHARM implementation to choose from.

PHARM_ORIGINAL = 1¶: Original PHARM implementation by DDE.

PHARM_REVISITED = 2¶: PHARM implementation with fixes.

GFR ¶

sealwatch.gfr.extract(img, *, num_rotations=32, quantization_steps=75, T=4, implementation=Implementation.GFR_ORIGINAL)¶

Extract the Gabor filter residual features from a given image.

Parameters:

img – grayscale image with values in range [0, 255]
num_rotations (int) – number of rotations for Gabor kernel
quantization_steps (int) – quantization step for each of the four scales
T (int) – the highest histogram bin value after quantization. The histogram contains T + 1 bins corresponding to the values [0, …, T]. Quantized values exceeding T will be clamped to T.
implementation (Implementation) –

Returns:

extracted Gabor features as 5D ndarray. The five dimensions denote:

Dimension 0: Phase shifts
Dimension 1: Scales
Dimension 2: Rotations/Orientations
Dimension 3: Number of histograms
Dimension 4: Co-occurrences

Flatten the 5D array to obtain a 1D feature descriptor.

Will be changed in the future to OrderedDict to match the common interface.

Return type:

Example:

>>> # TODO

sealwatch.gfr.extract_from_file(path, num_rotations=32, qf=None, quantization_steps=None, T=4, implementation=Implementation.GFR_ORIGINAL)¶

Extract the Gabor filter residual features from a given JPEG image file.

Parameters:

path (Union[Path, str]) – filepath to JPEG image
num_rotations (int) – number of rotations for Gabor kernel
qf (Optional[int]) – JPEG quality factor; used to select the quantization steps
quantization_steps (Optional[int]) – list of four quantization steps, one for each scale
T (int) – truncation threshold
implementation (Implementation) –

Returns:

extracted Gabor features as 5D ndarray. The five dimensions denote:

Dimension 0: Phase shifts
Dimension 1: Scales
Dimension 2: Rotations/Orientations
Dimension 3: Number of histograms
Dimension 4: Co-occurrences

Flatten the 5D array to obtain a 1D feature descriptor.

Return type:

class sealwatch.gfr.Implementation(value)¶

GFR implementation to choose from.

GFR_FIX = 2¶: GFR implementation with fixes.

GFR_ORIGINAL = 1¶: Original GFR implementation by DDE.

Detectors ¶

class sealwatch.ensemble_classifier.EnsembleClassifier(base_learners, d_sub=None)¶

predict(X)¶: Calculate predictions based on (unweighted) majority voting. Ties are resolved randomly.

Parameters:: X – samples of shape [num_samples, num_features]
Returns:: predictions of shape [num_samples], where -1 stands for the negative and +1 for the positive class

predict_confidence(X)¶: Calculate confidence score based on majority voting.

Parameters:: X – samples of shape [num_samples, num_features]
Returns:: confidence score of predictions of shape [num_samples], ranging from -1 for the negative class to +1 for the positive class

score(X, y_true)¶: Calculate accuracy.

Parameters:

X – samples of shape [num_samples, num_features]
y_true – labels of shape [num_samples], where -1 indicates a cover image and +1 indicates a stego image

Returns:

accuracy
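Unweighted majority voting with random tie-breaking, as described for predict, can be sketched as follows. The helper majority_vote and the layout of the votes array are assumptions for illustration; using the mean vote as the confidence score is likewise an assumption about how a score in [-1, +1] might be derived, not a statement about the library's internals.

```python
import numpy as np

def majority_vote(votes: np.ndarray, rng: np.random.Generator):
    """Combine votes of shape [num_learners, num_samples] with values -1 or +1."""
    confidence = votes.mean(axis=0)                  # mean vote, in [-1, +1]
    predictions = np.sign(confidence).astype(int)    # 0 marks a tie
    ties = predictions == 0
    predictions[ties] = rng.choice([-1, 1], size=int(ties.sum()))  # random tie-break
    return predictions, confidence

# Four hypothetical base learners voting on three samples.
votes = np.array([
    [+1, +1, -1],
    [+1, -1, -1],
    [+1, -1, -1],
    [-1, +1, -1],
])
predictions, confidence = majority_vote(votes, np.random.default_rng(0))
print(predictions, confidence)
```

The second sample receives two votes for each class, so its prediction depends on the random generator.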

class sealwatch.ensemble_classifier.FldEnsembleTrainer(Xc, Xs, seed=None, seed_subspaces=None, seed_bootstrap=None, L='automatic', d_sub='automatic', verbose=1, max_num_base_learners=500)¶

train()¶: Train an ensemble of Fisher linear discriminant classifiers.

Returns:: (ensemble_classifier, training_records) as 2-tuple, where ensemble_classifier is an instance of EnsembleClassifier and training_records is a list of dicts
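A single base learner of such an ensemble can be sketched as a Fisher linear discriminant trained on a random feature subspace. The helpers train_fld and predict_fld, the ridge term, and the midpoint threshold are assumptions for illustration; the trainer's actual threshold selection, bootstrapping, and automatic choice of L and d_sub are not reproduced here.

```python
import numpy as np

def train_fld(Xc: np.ndarray, Xs: np.ndarray, ridge: float = 1e-6):
    """Fit one Fisher linear discriminant; returns (weights, bias)."""
    mu_c, mu_s = Xc.mean(axis=0), Xs.mean(axis=0)
    # Within-class scatter, lightly regularized so the linear system stays solvable.
    Sw = np.cov(Xc, rowvar=False) + np.cov(Xs, rowvar=False)
    Sw = Sw + ridge * np.eye(Sw.shape[0])
    w = np.linalg.solve(Sw, mu_s - mu_c)
    b = -w @ (mu_c + mu_s) / 2        # threshold halfway between the class means
    return w, b

def predict_fld(X: np.ndarray, w: np.ndarray, b: float) -> np.ndarray:
    """Return +1 (stego) or -1 (cover) for each sample."""
    return np.where(X @ w + b > 0, 1, -1)

rng = np.random.default_rng(0)
Xc = rng.normal(0.0, 1.0, size=(500, 20))         # toy cover features
Xs = rng.normal(1.5, 1.0, size=(500, 20))         # toy stego features
subspace = rng.choice(20, size=5, replace=False)  # random subspace with d_sub = 5

w, b = train_fld(Xc[:, subspace], Xs[:, subspace])
pred_c = predict_fld(Xc[:, subspace], w, b)
pred_s = predict_fld(Xs[:, subspace], w, b)
accuracy = (np.mean(pred_c == -1) + np.mean(pred_s == 1)) / 2
print(accuracy)
```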

class sealwatch.xunet.XuNet¶

XuNet: A convolutional neural network for steganalysis.

Architecture:

- Preprocessing with high pass filter
- 5 convolutional groups with batch normalization, activation, and pooling
- Fully connected layers with softmax output

Intended for binary classification of stego and cover images.
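The high pass preprocessing step can be illustrated with the 5x5 KV kernel, which the original XuNet paper uses. Whether this class uses exactly this kernel is an assumption; the helper high_pass_residual is hypothetical and uses NumPy instead of a convolution layer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# 5x5 KV high pass kernel; its entries sum to zero.
KV = np.array([
    [-1,  2,  -2,  2, -1],
    [ 2, -6,   8, -6,  2],
    [-2,  8, -12,  8, -2],
    [ 2, -6,   8, -6,  2],
    [-1,  2,  -2,  2, -1],
], dtype=float) / 12

def high_pass_residual(x: np.ndarray) -> np.ndarray:
    """Filter a 2D image with the KV kernel, keeping only fully covered positions."""
    windows = sliding_window_view(x.astype(float), (5, 5))   # [H-4, W-4, 5, 5]
    # The kernel is symmetric, so correlation and convolution coincide.
    return np.einsum("ijkl,kl->ij", windows, KV)

flat = np.full((16, 16), 100)          # constant image: no high-frequency content
residual = high_pass_residual(flat)
print(residual.shape, np.abs(residual).max())
```

The filter suppresses image content and keeps the noise-like component, where embedding changes are easier to detect.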

forward(x)¶

Forward pass

Parameters:: x (Tensor) –
Return type:: Tensor

sealwatch.xunet.pretrained(model_path=None, model_name='XuNet-LSBM_0.4_lsb-250714133836.pt', *, device=device(type='cpu'), strict=True)¶

Loads pretrained model. Downloads if missing.

Parameters:

model_path (str) – local path to the model
model_name (str) – filename of the model
device (torch.device) – torch device
strict (bool) –

Returns:

loaded XuNet Model

Return type:

torch.nn.Module

sealwatch.xunet.infere_single(x, model=None, *, device=device(type='cpu'))¶

Runs inference for a single image.

Parameters:

x – image
model –
device –

Returns:

Return type:

Helper functions ¶

sealwatch.ensemble_classifier.helpers.load_hdf5(features_filename, max_num_samples=None)¶

Retrieve features and filenames from a HDF5 file.

When the origin attribute is “matlab”, the feature array is transposed.

Parameters:

features_filename – path to HDF5 file
max_num_samples – Only load the first n features and filenames

Returns:

(features, filenames) as 2-tuple

sealwatch.ensemble_classifier.helpers.load_features(cover_features_filename, stego_features_filename, max_num_samples=None)¶

Load cover and stego features.

On the way, drop images where the feature extraction failed. Also drop images where we have no matching cover-stego pairs.

Parameters:

cover_features_filename – path to HDF5 file containing the cover features
stego_features_filename – path to HDF5 file containing the stego features
max_num_samples – take only the first n samples from each dataset. Useful for quick prototyping.

Returns:

(cover_features, stego_features, cover_filenames, stego_filenames) as 4-tuple. cover_features and stego_features are ndarrays of shape [num_samples, num_features]. cover_filenames and stego_filenames are lists of strings.
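The filtering described above can be sketched without HDF5 files. The helper match_pairs is hypothetical, and two details are assumptions: failed extractions are taken to show up as non-finite feature rows, and covers and stegos are matched by filename without extension.

```python
import os
import numpy as np

def match_pairs(cover_X, cover_names, stego_X, stego_names):
    """Keep only cover-stego pairs that have valid features in both sets."""
    def index(X, names):
        # Map extension-free filename to feature row, skipping non-finite rows.
        return {os.path.splitext(name)[0]: row
                for name, row in zip(names, X) if np.all(np.isfinite(row))}
    cover, stego = index(cover_X, cover_names), index(stego_X, stego_names)
    common = sorted(cover.keys() & stego.keys())
    return (np.array([cover[k] for k in common]),
            np.array([stego[k] for k in common]),
            common)

cover_X = np.array([[1.0, 2.0], [np.nan, 0.0], [3.0, 4.0]])
stego_X = np.array([[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]])
Xc, Xs, names = match_pairs(
    cover_X, ["a.png", "b.png", "c.png"],   # "b" failed, "c" has no stego
    stego_X, ["a.jpg", "b.jpg", "d.jpg"],   # "d" has no cover
)
print(names, Xc, Xs)
```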

sealwatch.ensemble_classifier.helpers.remove_file_extension(f)¶

sealwatch.ensemble_classifier.helpers.load_and_split_features(cover_features_filename, stego_features_filename, train_csv, test_csv, max_num_samples=None)¶

Load cover and stego features and split them into training and test sets.

On the way, drop images where the feature extraction failed. Also drop images where we have no matching cover-stego pairs.

Parameters:

cover_features_filename – path to HDF5 file containing the cover features
stego_features_filename – path to HDF5 file containing the stego features
train_csv – csv file containing the filenames to use for training
test_csv – csv file containing the filenames to use for testing
max_num_samples – take only the first n samples from each dataset. Useful for quick prototyping.

Returns:

6-tuple:

0: cover_features_train
1: stego_features_train
2: cover_features_test
3: stego_features_test
4: train_filenames (same length as covers; the covers and stegos have the same filenames)
5: test_filenames (same length as covers; the covers and stegos have the same filenames)

sealwatch.ensemble_classifier.helpers.load_features_subset(cover_features_filename, stego_features_filename, test_csv)¶
