Metadata-Version: 2.1
Name: abito
Version: 0.1.2
Summary: Package for hypothesis testing in A/B-experiments
Home-page: https://github.com/avito-tech/abito
Author: Danila Lenkov
Author-email: dlenkoff@gmail.com
License: MIT
Description: # abito
        [![Build Status](https://travis-ci.com/avito-tech/abito.svg?branch=master)](https://travis-ci.com/avito-tech/abito)
        [![Coverage Status](https://coveralls.io/repos/github/avito-tech/abito/badge.svg?branch=master)](https://coveralls.io/github/avito-tech/abito?branch=master)
        
        Python package for hypothesis testing. Suitable for use in A/B-testing software.
        Tested on Python >= 3.5. Based on numpy and scipy.
        
        ##### Features
        1. Convenient interface for running significance tests.
        2. Support for ratio samples, with delta-method linearization built in (see the sketch after this list).
        3. Bootstrap: measure the significance of any statistic, even quantiles. Multiprocessing is supported.
        4. Ntile-bucketing: compress samples for better performance.
        5. Trim: get rid of heavy tails.
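        
        A minimal numpy sketch of the linearization idea behind ratio samples; it illustrates the delta-method trick itself rather than abito's ratio-sample API, and the variable names below are made up for the example:
        ```python
        import numpy as np
        import abito as ab
        
        # toy ratio metric (clicks per view) for treatment and control users
        clicks_t, views_t = np.random.poisson(2, 1000), 1 + np.random.poisson(10, 1000)
        clicks_c, views_c = np.random.poisson(2, 1000), 1 + np.random.poisson(10, 1000)
        
        # linearization: replace the per-user ratio with a linear proxy
        # L_i = x_i - R0 * y_i, where R0 is the overall control ratio
        r0 = clicks_c.sum() / views_c.sum()
        lin_t = clicks_t - r0 * views_t
        lin_c = clicks_c - r0 * views_c
        
        # the linearized values are plain per-user observations,
        # so an ordinary test can be applied to them
        print(ab.sample(lin_t).t_test(ab.sample(lin_c)))
        ```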
        
        
        ## Installation
        ```
        pip install abito
        ```
        
        ## Usage
        
        The central tool in this package is the Sample. First, import the package:
        ```python
        import abito as ab
        ```
        
        Let's draw some observations from a Poisson distribution and create a Sample instance from them.
        ```python
        import numpy as np
        
        observations = np.random.poisson(1, size=10**6)
        sample = ab.sample(observations)
        ```
        
        Now we can calculate any statistic, just as we would with numpy.
        ```python
        print(sample.mean())
        print(sample.std())
        print(sample.quantile(q=[0.05, 0.95]))
        ```
        
        To compare against another sample, we can use t_test or mann_whitney_u_test:
        ```python
        observations_control = np.random.poisson(1.005, size=10**6)
        sample_control = ab.sample(observations_control)
        
        print(sample.t_test(sample_control))
        print(sample.mann_whitney_u_test(sample_control))
        ```
        
        ### Bootstrap
        Alternatively, we can use the bootstrap to compare any statistic:
        ```python
        sample.bootstrap_test(sample_control, stat='mean', n_iters=100)
        ```
        
        To improve performance, it's better to provide observations in weighted form: unique values + counts. Alternatively, we can convert samples to weighted form using the built-in reweigh method:
        ```python
        sample.reweigh(inplace=True)
        sample_control.reweigh(inplace=True)
        sample.bootstrap_test(sample_control, stat='mean', n_iters=10000)
        ```
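        For reference, the weighted form is simply the unique values paired with their counts, which numpy can produce directly (this sketch only illustrates what the weighted representation contains; it does not call any abito API):
        ```python
        # weighted form = unique values + how often each one occurs
        values, counts = np.unique(observations, return_counts=True)
        print(values[:5], counts[:5])
        ```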
        In weighted form, the bootstrap runs lightning-fast. To improve performance further, set the n_threads parameter to a value greater than 1 to run the bootstrap using multiprocessing.
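        
        For example, a minimal sketch of the multiprocessing option (this assumes n_threads is passed as a keyword argument to bootstrap_test, which the parameter name above suggests but the snippet does not verify):
        ```python
        # spread the bootstrap iterations across 4 worker processes
        sample.bootstrap_test(sample_control, stat='mean', n_iters=10000, n_threads=4)
        ```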
        
        ### Compress
        Ntile-bucketing compresses a sample into a fixed number of bucket statistics (here, 100 bucket means), so statistics on the compressed sample are much cheaper to compute:
        ```python
        observations = np.random.normal(100, size=10**8)
        sample = ab.sample(observations)
        
        compressed = sample.compress(n_buckets=100, stat='mean')
        
        %timeit sample.std()
        %timeit compressed.std()
        ```
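        The compressed object supports the same statistics as an ordinary sample (as the std call above shows); assuming it also exposes the test methods, two compressed samples can be compared directly:
        ```python
        # compress a control sample the same way, then run the usual
        # bootstrap comparison on the two small bucketed samples
        observations_control = np.random.normal(100.01, size=10**8)
        compressed_control = ab.sample(observations_control).compress(n_buckets=100, stat='mean')
        
        print(compressed.bootstrap_test(compressed_control, stat='mean', n_iters=10000))
        ```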
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Scientific/Engineering
Description-Content-Type: text/markdown
