Wotan Interface

Detrending with the flatten module

flatten removes low frequency trends in time-series data.

Parameters:
  • time (array-like) – Time values
  • flux (array-like) – Flux values for every time point
  • window_length (float) – The length of the filter window in units of time (usually days), or in cadences (for cadence-based sliders savgol and medfilt).
  • method (string, default: biweight) – Detrending method. Rime-windowed sliders: median, biweight, hodges, tau, welsch, huber, huber_psi, andrewsinewave, mean, hampel, ramsay, trim_mean, hampelfilt, winsorize. Cadence based slider: medfilt. Splines: hspline, rspline`, ``pspline. Locally weighted scatterplot smoothing: lowess. Savitzky-Golay filter: savgol. Gaussian processes: gp. Cosine Filtering with Autocorrelation Minimization: cofiam. Cosine fitting: cosine, Friedman’s Super-Smoother: supersmoother. Gaussian regressions: ridge, lasso, elasticnet.
  • break_tolerance (float, default: window_length/2) – If there are large gaps in time (larger than window_length/2), flatten will split the flux into several sub-lightcurves and apply the filter to each individually. break_tolerance must be in the same unit as time (usually days). To disable this feature, set break_tolerance to 0. If the method is supersmoother and no break_tolerance is provided, it will be taken as 1 in units of time.
  • edge_cutoff (float, default: None) – Trends near edges are less robust. Depending on the data, it may be beneficial to remove edges. The edge_cutoff defines the length (in units of time) to be cut off each edge. Default: Zero. Cut off is maximally window_length/2, as this fills the window completely. Applicable only for time-windowed sliders.
  • cval (float or int) – Tuning parameter for the robust estimators. See documentation for defaults. Larger values for make the estimate more efficient but less robust. For the super-smoother, cval determines the bass enhancement (smoothness) and can be None or in the range 0 < cval < 10. For the savgol, cval determines the (integer) polynomial order (default: 2).
  • proportiontocut (float, default: 0.1) – Fraction to cut off (or filled) of both tails of the distribution using methods trim_mean (or winsorize)
  • kernel (str, default: squared_exp) – Choice of squared_exp (squared exponential), matern, periodic, periodic_auto.
  • kernel_size (float, default: 1) – The length scale of the Gaussian Process kernel.
  • kernel_period (float) – The periodicity of the Gaussian Process kernel (in units of time). Must be provided for the kernel periodic. Can not be specified for the periodic_auto, for which it is determined automatically using a Lomb-Scargle periodogram pre-search.
  • robust (bool, default: False) – If True, the fitting process will be run iteratively. In each iteration, 2-sigma outliers from the fitted trend will be clipped until convergence. Supported by the Gaussian Process kernels squared_exp and matern, as well as cosine fitting.
  • return_trend (bool, default: False) – If True, the method will return a tuple of two elements (flattened_flux, trend_flux) where trend_flux is the removed trend.
returns:
  • flatten_flux (array-like) – Flattened flux.
  • trend_flux (array-like) – Trend in the flux. Only returned if return_trend is True.

Usage example:

import numpy as np
from astropy.io import fits

def load_file(filename):
    """Loads a TESS *spoc* FITS file and returns TIME, PDCSAP_FLUX"""
    hdu = fits.open(filename)
    time = hdu[1].data['TIME']
    flux = hdu[1].data['PDCSAP_FLUX']
    flux[flux == 0] = np.nan
    return time, flux

print('Loading TESS data from archive.stsci.edu...')
path = 'https://archive.stsci.edu/hlsps/tess-data-alerts/'
filename = "hlsp_tess-data-alerts_tess_phot_00062483237-s01_tess_v1_lc.fits"
time, flux = load_file(path + filename)

# Use wotan to detrend
from wotan import flatten
flatten_lc1, trend_lc1 = flatten(time, flux, window_length=0.75, return_trend=True, method='mean')
flatten_lc2, trend_lc2 = flatten(time, flux, window_length=0.75, return_trend=True, method='biweight')

# Plot the result
import matplotlib.pyplot as plt
plt.scatter(time, flux, s=1, color='black')
plt.plot(time, trend_lc1, linewidth=2, color='red')
plt.plot(time, trend_lc2, linewidth=2, color='blue')
plt.xlim(min(time), 1365)
plt.show()

Choosing the right window size

Shorter windows (or knot distances, smaller kernels…) remove stellar variability more effectively, but suffer a larger risk of removing the desired signal (the transit) as well. What is the right window size?

For the time-windowed sliders, the window should be 2-3 times longer than the transit duration (for details, read [the paper](www). The transit duration is

\(T_{14,{\rm max}} = (R_{\rm s}+R_{\rm p}) \left( \frac{4P}{\pi G M_{\rm s}} \right)^{1/3}\)

for a central transit on a circular orbit. If you have a prior on the stellar mass and radius, and a (perhaps maximum) planetary period, wotan offers a convenience function to calculate \(T_{14,{\rm max}}\):

Planetary transit duration assuming a central transit on a circular orbit

Parameters:
  • R_s (float) – Stellar radius (in solar radii)
  • M_s (float) – Stellar mass (in solar masses)
  • P (float) – Planetary period (in days)
  • small_planet (bool, default: False) – Small planet assumption if True (zero planetary radius). Otherwise, a planetary radius of 2 Jupiter radii is assumed (maximum reasonable planet size)
returns:

t14 (float) – Planetary transit duration (in days)

As an example, we can calculate the duration of an Earth-Sun transit:

from wotan import t14
tdur = t14(R_s=1, M_s=1, P=365, small_planet=True)
print(tdur)

This should print ~0.54 (days), or about 13 hours. To protect a transit that long, it is reasonable to choose a window size of 3x as long, or about 1.62 days. With the biweight time-windowed slider, we would detrend with these settings:

from wotan import t14, flatten
tdur = t14(R_s=1, M_s=1, P=365, small_planet=True)
flatten_lc = flatten(time, flux, window_length=3 * tdur)

Removing outliers before the detrending

Despite robust detrending methods, it is sometimes preferable to remove outliers. Wotan offers a sliding time-windowed function to do this. The user can define an upper and lower threshold, in multiples of the standard deviation or the median absolute deviation. The middle point in each window can be calculated with the mean or the median. Outliers are replaced with NaN values.

Sliding time-windowed outlier clipper.

Parameters:
  • time (array-like) – Time values
  • flux (array-like) – Flux values for every time point
  • window_length (float) – The length of the filter window in units of time (usually days)
  • low (float or int) – Lower bound factor of clipping. Default is 3.
  • high (float or int) – Lower bound factor of clipping. Default is 3.
  • method (mad (median absolute deviation; default) or std (standard deviation)) – Outliers more than low and high times the mad (or the std) from the middle point are clipped
  • center (median (default) or mean) – Method to determine the middle point
returns:

clipped (array-like) – Input array with clipped elements replaced by NaN values.

Example:

from wotan import slide_clip
clipped_flux = slide_clip(
    time,
    flux,
    window_length=0.5,
    low=3,
    high=2,
    method='mad',  # mad or std
    center='median'  # median or mean
    )

Removing outliers after the detrending

With robust detrending methods, the trend line (and thus the detrended data) may be unaffected by outliers. In the actual data, however, outliers are still present after detrending. For many purposes, it is acceptable to clip this:

from astropy.stats import sigma_clip
flux = sigma_clip(flux, sigma_upper=3, sigma_lower=20)

Masking transits during detrending

If transits have already been discovered, it is best practice to mask them while detrending. This way, the in-transit data points can not influence the detrending.

The current version supports this feature in the cosine and lowess methods. It is implemented but experimental in most other methods (give it a try…)

Example:

Example:

from wotan import transit_mask, flatten
mask = transit_mask(
    time=time_array,
    period=1.234,
    duration=0.1,
    T0=1234.123)
flatten_lc, trend_lc = flatten(
    time,
    flux,
    method='cosine',
    window_length=0.5,
    return_trend=True,
    robust=True,
    mask=mask
    )