Linux personalization

I like to customize my Linux environment as follows (applies to Ubuntu distros):

* Set-up logging display on tty9-tty12:
Add to /etc/rc.local:

tail -f /var/log/syslog >/dev/tty9 &
tail -f /var/log/auth.log >/dev/tty10 &
# Display tcpdump output on tty11
tcpdump >/dev/tty11 &
# Start top on tty12 (need to declare TERM b/c it is undeclared
# when rc.local is run in the boot process).
# Running top on tty12 is a security risk as any non-logged-in
# user can kill processes.
export TERM=linux
sudo openvt -c 12 top &

* Set-up bash prompt to display current time
NOTE: This mod allows one to easily tell how long a process took to complete.

Add "\D{%F %T}" to the PS1 declaration in the current user's .bashrc, so that the new PS1 becomes:

PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w \D{%F %T}\n\$ '

There are two PS1 declarations in .bashrc; add it to both.

The full list of prompt options is detailed in man bash, under the PROMPTING section.
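
With this change the prompt renders along these lines (the user, host and path here are made up):

user@devbox:~/projects 2016-03-05 11:37:02
$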

Glossary for “Structures – or why things don’t fall down” by J.E. Gordon

A biscuit is stiff but weak, steel is stiff and strong, nylon is flexible (low E) and strong, raspberry jelly is flexible (low E) and weak.

  • stiffness: (p 50) the slope of a stress-strain diagram measures the elastic stiffness or floppiness of a given solid. This is also known as Young’s modulus of elasticity or “elastic modulus”.

    Stiffness = E = stress / strain [SI: MN/m^2]

    The stiffness of a material is the stress for which the length of the material would double (if the material doesn’t break by then).

    Stiff (high E) vs Flexible (low E)

  • strain: (p 49) How far the atoms at any point in a solid are being pulled apart – that is, by what proportion the bonds between the atoms are stretched.

    Strain = (increase in length) / (original length) [dimensionless – sometimes expressed as a %]

  • strength: (p 56) The stress required to break a piece of the material itself. Often we are concerned with the “ultimate tensile stress”, which is the “tensile strength” of a material, determined by breaking a small test-piece in a testing machine.

    Strong (withstands high stress) vs Weak (breaks at low stress)

  • stress: (p 46) A measure of how hard the atoms in a material are being pushed together or pulled apart as a result of external forces. Unlike pressure, stress in a solid is (for most discussions) a directional, one-dimensional affair.

    Stress = (load or force applied) / (cross-sectional area) [SI: MN/m^2]

    1 Meganewton ~ 100 tons force
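
    A made-up worked example with round numbers: a 50 kN (0.05 MN) pull on a rod of 0.001 m^2 cross-section gives

    Stress = 0.05 MN / 0.001 m^2 = 50 MN/m^2

    and if that pull stretches the rod from 2.000 m to 2.001 m then

    Strain = 0.001 m / 2.000 m = 0.0005 (0.05%)

    so E = 50 / 0.0005 = 100,000 MN/m^2 – roughly the right order of magnitude for a metal.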

    Server Specs for small bioinformatics lab

    There seems to be very little information on the internets about how to build a small workhorse computer for running bioinformatics pipelines. Most of the information is focused on big, expensive Next-Gen Sequencing centers, and there is none for small university labs or companies.

    Here’s my contribution to this lack of data.

    Disclaimer – this post does not address the storage solution for bioinformatics labs. For large data sets this is a non-trivial problem, though for smaller data sets (under 50TB) getting an 8-bay NAS with 8x 6TB HDDs (or even 8TB HDDs) might be a decent start.

    Now, regarding the workhorse computer build.

    The build I chose is based around a high-end gaming Intel architecture – as this is what I have moderate experience with.

    I propose two solutions that differ only in the CPU model used. If you find you can parallelize a lot of the pipeline, the more expensive option is probably going to be faster – at a pretty high cost.

    Also all component prices are in $CAD and are based on NCIX.com list price.

    Without further ado:

    Component       | Model                                                 | Price     | Notes
    Motherboard     | EVGA X99 Classified E-ATX X99                         | $529.99   | 8x DDR4 slots (128GB); M.2 connector; 10x SATA interfaces; socket supports both Xeon and i7 CPUs.
    CPU (cheap)     | Intel Core i7-6800K Broadwell-E                       | $589.99   | Marginally more expensive than the 4-core Skylake i7, but has 2x the L3 cache and 6 cores.
    CPU (expensive) | Intel Xeon E5-2680 v4 (BX80660E52680V4)               | $2,523.40 | 14 cores (28 threads) with 3.5MB L2 and 35MB L3 (about 4x the i7 at about 5x the price).
    RAM             | 128GiB – 4x Corsair Vengeance LPX 32GB (2x16GB) DDR4 3000MHz | $999.92   | Maxed out the MB – the more the better.
    SSD             | Samsung 950 Pro 512GB M.2 2280 NVMe                   | $429.99   | Placeholder until the Samsung 960 Pro is released later in the year.
    HDD             | 2x Western Digital WD60EFRX 6TB Red                   | $639.98   | In RAID 0 for increased data transfer.
    Power supply    | EVGA SuperNOVA 750 G2 80 PLUS                         | $134.99   | 750W is OK-ish for this MB (more would be better).
    Case            | Phanteks Enthoo Pro Full Tower E-ATX                  | $129.99   | Nothing fancy.

    Total price (cheap): $3,454.85 (before tax) [~3,900 after tax]

    Total price (expensive): $5,388.26 (before tax) [~6,100 after tax]

    Note: You’ll also need to buy a video card – I will use one that I have lying around so I didn’t include it in the price. As I don’t think the bioinformatics tools are GPU optimized the graphics card can be as cheap as possible. Also there might be some cables you need to buy.

    A few other considerations:

  • Instead of using a single powerful computer you can build a computing cluster. The major issue with this is that someone needs to administer the cluster. The personnel is a recurring cost that completely dominates the cost of the cluster itself in the long run.
  • Alternatively you could look into DNA Nexus for a cloud alternative for your computing needs.
    Poor man’s Bose – Open Office and On The Go Noise Control

    This is a note that I will point people to when they ask me about my solution to the problem of noisy office environments and also transit noise (train/subway/bus).

    TLDR:

  • ($46) For a cheap alternative to Bose noise canceling headphones for office noise control get the Leight Sync earmuffs and the MPOW bluetooth receiver and listen to brown noise at Simply Noise.
    NOTE: For high end sound and more money get the UltraPhones
  • ($60) For on the go noise control (bus, train, airplane) get a pair of Leight L0F portable earmuffs to use with the LG HBS-750 bluetooth earbuds

    When I started working at the one and only startup I’ve been involved with, everyone was young and eager to try the latest and greatest ideas to make things work as effectively as possible. One of the ideas we adopted was the open plan office. This is the plan where all the employees’ desks sit next to each other so everyone can see and hear everyone else, which improves communication and strengthens teamwork. Of course I had read Peopleware and was aware of how difficult it would be to stay in the zone in such an environment, but in the great scheme of trade-offs the open office plan came out ahead. Did I mention I was the lead software developer at the company and had to be able to stay in the zone for solid chunks of time to deliver the promised code? Oh well… how did I deal with it?

    First everyone went through the standard headphones plus music stage. It didn’t work very well for me. Music after a while gets tiring – especially when I’m in my 8th hour of listening through emotionally taxing soundtracks like Two Steps from Hell. It also needs to be cranked up pretty high to overpower the people talking around me – and that damages my hearing. As I like my hearing to be “not damaged” – no music for me. Then a couple of us went through the white noise generator stage – which I would say was somewhat successful but also needed to be quite loud to drown out surrounding conversations.

    Then there was the white noise generator plus bandana stage. That didn’t improve on the solution by much. The bandana was used as a signal – every time someone had it on we would try to keep quiet, as we knew they needed to stay focused and work hard. Then came the Bose noise-canceling headphones stage.

    Let me tell you about Bose headphones – I have two pairs of their QC20s. I bought my first pair in March 2009 and the second about 3 years ago (it’s for my SO to play the electronic piano in the middle of the night). Bose are great headphones for looking like you are an audiophile. As far as I can tell – from talking to friends who record music in studios – nobody uses them professionally. The first pair I bought was $270 and, as far as their “noise-canceling” technology goes, they need to turn the canceling up to 11. They do a pretty good job with steady, constant droning noises (like HVAC fans), but any higher frequency sound – like people talking – is left uncancelled. On top of their lack of cancelling in the frequency band that matters, they are designed for obsolescence (i.e. they break apart within a few months of daily use), so you need to purchase expensive replacement parts for them. I’ve replaced the ear cups on them twice so far, and for the last pair I’ve been very careful not to damage them. The problem is they come undone even if I barely use them, and at $30 a pair they are expensive!!! When I noticed the head band starting to peel off I realized these headphones are not for me. All in all the Bose are maybe worth 1/4 the money they charge for them. Though there was a silver lining that hopefully made my Bose experience worth it – I noticed that most of their “noise-cancelling” was actually passive and came from the good seal created by the headphone ear cups.

    So with the expensive lesson learned from my Bose affair, I started down the path of finding something else that was good at suppressing sound passively. I got lucky that one of my coworkers – who had previously worked on construction sites – lent me the earmuffs I had noticed him wearing in our open office. They were rugged and cheap, and after trying them out I considered them a solid starting point for my next attempt at office noise control. I got myself a pair of Leightning L3s for $30 and, combined with some ear plugs, found myself in possession of an amazing solution for blocking out noise. But this solution was not feasible for office life, as it took forever to take off and put on my noise-cancelling contraption every time someone needed to talk to me. What I wanted was a pair of L3s with headphones in them, so I could play white noise at low volume to drown out conversations and not need the ear plugs.

    Sometime in the last couple of years I noticed that Leight was offering their Leight Sync earmuffs. These were very close to what I needed. I purchased a pair and, yes, combined with a white noise generator they were very close to a final solution for my office noise problems. They did have one trait I disliked though – they needed a wire connecting them to my computer. I cannot count how many times I rolled my chair away from my desk to test some hardware, only to have my headphone wire drag me back like a dog on a leash. And then I found the MPOW bluetooth receiver – which transformed the Leight Sync into wireless noise cancelling headphones and made my final solution complete. The final sticker price for this wireless office noise control solution, which I use and promote to others, comes to about $46.

    A note on noise generators: there are many white noise generators out there. I use the very popular Simply Noise. And even though I say I listen to white noise, I find brown noise to be somewhat more effective at silencing voices around me. I think this is because brown noise concentrates most of its power in the lower frequencies – where the earmuffs have the hardest time suppressing sound.
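
    If you prefer to generate your own, here is a minimal sketch of the usual way to get brown noise – integrate white noise, which concentrates power at low frequencies (numpy/scipy assumed; the output file name is made up):

    import numpy as np
    from scipy.io import wavfile

    fs = 44100                      # sample rate in Hz
    seconds = 60
    white = np.random.randn(fs * seconds)
    brown = np.cumsum(white)        # integrating white noise gives a ~1/f^2 spectrum
    brown -= brown.mean()
    brown /= np.abs(brown).max()    # scale into [-1, 1]
    wavfile.write("brown_noise.wav", fs, (brown * 32767).astype(np.int16))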

    [Figure: earmuff attenuation plot]

    Also a few notes on earmuffs:

    NOTE 1: For the first two weeks of wearing earmuffs they seemed to press on my head in a strange way that made my ears, jaw and head hurt. After a while of me stretching the earmuffs and my head getting used to them, they became much more comfortable.

    NOTE 2: Depending on the shape of your head, the head band of the earmuffs can make the crown of your head hurt – though this is not limited to earmuffs, as many headphones have this undesirable feature.

    NOTE 3: Howard Leight earmuffs seem to come in two variants, the Honeywell and the Sperian – I guess Leight licensed their products to both of these companies to produce. I’ve only had a couple of each – so I might not have a statistically significant N here – but it seems to me Sperian is slightly better quality than Honeywell.

    NOTE 4: If you would like a high-end audiophile solution to replace the Leight Sync, check out the UltraPhones – which combine the same 29 dB hearing protection with the Sony 7506 Professional Studio Monitor headphones.

    At around the same time as getting the Leightning L3s I got a pair of Leight L0F portable earmuffs to use with my LG HBS-750 bluetooth earbuds while traveling on noisy buses and trains (the price for the combo was around $60 in Nov 2012). The L0Fs and LG750 have been my go-to audio solution for the last 4 years when traveling anywhere (be it commuting to work or going around the world on an airplane). I’m no big audiophile – I just want to listen to audiobooks and news articles while protecting my hearing. Before getting the L0Fs I had to crank the volume up to 90%, which in the long term was going to damage my ears. After adding the earmuffs the volume on the headphones stays below 50%. The one problem I have with the earbuds is that if I wear them for more than a few hours my ears kind of hurt – so this combo doesn’t work for my long stints at the office.

    My Engineering Book Selection

    I was unfortunate not to have any personal mentors to teach me the more practical engineering skills needed while working in a startup. Though I was fortunate enough to get mentored by the ghosts of a few bright engineers – through their books. Here is a constantly evolving list of books that I suggest any engineer or technically inclined person in today’s rapidly evolving world read (none of them is too mathematically heavy).

    Deploying a PyQt app using py2exe

    For future reference this exercise was done on Windows 8.1 with Python(x,y) and MS C++ 2008 Redistributable Package installed (though py2exe still couldn’t find MSVCR90.DLL):

    In the same folder put three files, as in the TestPyQtGUIApp archive.

    The MyGUIApp.py contents are:

    """
    Created on Sat Mar 05 11:37:02 2016
    
    @author: e1z
    """
    
    import sys
    
    from PyQt4 import QtCore, QtGui, uic
    from PyQt4.QtCore import SIGNAL, SLOT
    
    class MyGUIApp(QtGui.QApplication):
      def __init__(self):
        QtGui.QApplication.__init__(self, sys.argv)
        self.connect(self, SIGNAL("lastWindowClosed()"), self, SLOT("quit()"))
        self.win = uic.loadUi("myAppUI.ui")
        self.win.show()
        
    if __name__ == "__main__":
      MyGUIApp().exec_()
    

    The setup.py contents are:

    #!/usr/bin/env python
    
    from distutils.core import setup
    import py2exe
    
    setup(windows=[{"script": 'MyGUIApp.py'}],
          options={
              "py2exe": {
                  "bundle_files": 1,
                  "compressed": True,
                  "dll_excludes": ["MSVCP90.dll"],
                  "includes": ["sip"],
                  }
              }  
         )
    

    Open a command prompt in the folder and run:

    python setup.py py2exe

    The py2exe scripts are going to take over and quite verbosely indicate what they are doing. Once the scripts are done cd into the dist subfolder and run the .exe file (the GUI does nothing other than show it is running).

    If you get an error message about invalid syntax in C:\Python27\Lib\site-packages\PyQt4\uic\port_v3 go to the folder where port_v3 is located and rename the __init__.py file to OBSOLETE__init__.py. The port_v3 is for Python3 scripts … and some are questioning its inclusion into Python2 packages.

    If you are using matplotlib packages you will need to add matplotlib specific snippets to the setup.py file as seen here.
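
    As a rough sketch of the kind of matplotlib-specific additions usually needed (not necessarily the exact snippet linked above, and assuming a matplotlib version that still ships get_py2exe_datafiles()), the idea is to bundle matplotlib's data files and pull in the Qt4Agg backend:

    #!/usr/bin/env python

    from distutils.core import setup
    import py2exe
    import matplotlib

    setup(windows=[{"script": 'MyGUIApp.py'}],
          # matplotlib needs its data files (fonts, matplotlibrc) shipped next to the exe
          data_files=matplotlib.get_py2exe_datafiles(),
          options={
              "py2exe": {
                  "bundle_files": 1,
                  "compressed": True,
                  "dll_excludes": ["MSVCP90.dll"],
                  # py2exe's import scanner tends to miss sip and the Qt4Agg backend
                  "includes": ["sip", "matplotlib.backends.backend_qt4agg"],
                  }
              }
         )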

    DO YOU GIT IT?

    I touched on the world of github.com a few months ago and I realized I needed a cheat sheet to navigate it. Here it is for future reference.


    |------------------------
    | git clone
    | copies files into local folder
    |----------------
    | git status
    | shows the state of the working tree (staged, unstaged and untracked changes)
    |------------------------
    | git push
    | pushes all commits to the github repository
    | git pull
    | pulls all commits from the github repository
    |------------------------
    | git commit -m "message"
    | -m adds a message to the commit
    | (if -m is not used vim/nano opens and a message can be added)
    | git add -A
    | adds all files
    | git add .
    | adds all files in the current directory
    |------------------------
    | git branch
    | shows list of branches
    | git branch newBranch
    | creates a new branch off the current branch
    | git checkout newBranch
    | makes newBranch the active branch
    |------------------------
    | git checkout newBranch
    | makes newBranch the active branch
    | git merge master
    | attempts to merge master into newBranch
    | (it always attempts to merge into the current branch)
    |------------------------
    | git log origin/master..HEAD
    | shows unpushed commits
    |------------------------
    | git diff origin/master..HEAD
    | shows unpushed diffs
    |------------------------
    | git reset --hard
    | revert changes to modified files.
    | git clean -fd
    | remove all untracked files and directories.
    |------------------------
    | git stash
    | saves current changes in a "stash" (without committing them)
    | git stash list
    | lists available stashes
    | git stash apply stash@{0}
    | reapply changes from stash@{0}
    | git stash drop stash@{0}
    | drops stash@{0}
    |------------------------
    | git log
    | show the commit history
    | git log -p -2
    | show the differences introduced in the last 2 commits
    |------------------------

    Standard workflow consists of:

    git pull
    git branch
    [implement feature]
    git add
    git commit
    git push
    [then pull request and merge]

    That’s about it.

    STRAY THOUGHTS ON EMBEDDED DEVELOPMENT

    I’ve learned from the wonderful book An Embedded Software Primer by David Simon that there are – generally speaking – 4 classes of embedded system architectures:

    1) Infinite loop with round-robin calls
    2) Infinite loop with round-robin with interrupts
    3) Function-queue scheduling
    4) RTOS

    When starting a project attempt to identify which of the above best fits your requirements.


    Embedded development is focused on speed. One optimization I found for the von Neumann architecture that decreases latency when doing signal processing is to store often-used calibration values in RAM. This is not only faster than dereferencing pointers to FLASH memory, but it also keeps the bus clear for code retrieval.


    An important interface between the electrical engineer and the embedded software developer is a table (updated regularly) containing all DAQ inputs into the uController. This table should have the following columns:
    – uController pin #
    – uController port and line (if available)
    – type of pin (DI, DO, AI, AO, other)
    – a verbose description

    The description should contain the mapping of the measured/controlled parameter if AI/AO, for example:
    [0V; 3.3V] -> [0; 4095 ADC counts] (12 bit) -> [-120mA; 120mA]
    for a current-sensing AI pin.
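
    As a quick illustration of that mapping, here is a sketch of the conversion on the host side (the function name and limits are made up for this example):

    ADC_FULL_SCALE = 4095            # 12-bit converter
    I_MIN_MA, I_MAX_MA = -120.0, 120.0

    def adc_counts_to_ma(counts):
        """Linear map from raw ADC counts to measured current in mA."""
        fraction = float(counts) / ADC_FULL_SCALE
        return I_MIN_MA + fraction * (I_MAX_MA - I_MIN_MA)

    print(adc_counts_to_ma(0))       # -120.0 mA
    print(adc_counts_to_ma(2048))    # ~0 mA (mid-scale)
    print(adc_counts_to_ma(4095))    # +120.0 mA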

    Also, speaking of DAQ, I find it very useful to have a header file that contains only the definitions of all the uController pins.


    As embedded hardware is often very close to the physical world, I strongly recommend including measurement units as postfixes on all variables that represent measurable quantities (mA; A; kPa etc.)


    For all calibration parameters keep a table in the design spec with the following columns:
    – max, min values
    – default value
    – minimum increment value.

    This is valuable during validation.


    Here is a list of “tricky” issues that I encountered while dealing with data acquisition development:

    ADC ghosting
    Many AI ports are multiplexed onto the same ADC module, so when reading multiple inputs in sequence the previous reading introduces a bias into the current one.

    One way I dealt with this was to intersperse a reading of an AI tied to GND between every other AI reading, as sketched below.
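
    A rough sketch of that read ordering (the real code lived in C on the target; adc_read and the channel numbers here are hypothetical):

    GND_CHANNEL = 7                      # an analog input tied to GND
    ACTIVE_CHANNELS = [0, 1, 2, 3]       # the AI channels we actually care about

    def read_all_inputs(adc_read):
        """adc_read(channel) -> raw counts; returns {channel: counts}."""
        readings = {}
        for channel in ACTIVE_CHANNELS:
            adc_read(GND_CHANNEL)        # dummy conversion flushes the previous sample
            readings[channel] = adc_read(channel)
        return readings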

    ADC conversion current
    When an analog signal is converted into a digital reading a small (yet significant) amount of current is pulled into the uController pin. If the electrical design has any resistances in series with the analog input pin the voltage read on the pin will be offset by the voltage drop developed due to the conversion current.

    This is more of an electrical design issue, but ensure that all analog inputs are properly buffered before they reach the uController.

    Aliasing
    There are a few ways to determine if aliasing is occurring. One easy way is to disable and enable the ADC clock at random intervals (the intervals cannot be a multiple of the sampling period) and observe whether the ADC value changes. If it does, aliasing is most likely occurring. As far as I can tell aliasing cannot be addressed in software. A low-pass filter needs to be added as close as possible to the ADC line into the uController pin.

    STRAY THOUGHTS ON APP DEVELOPMENT

    I found the following in an e-mail I sent to a friend. It contains my raw thoughts about what to consider when developing an app. I brushed it up here and there and then posted it for my future reference.


    Identify what the application needs to do (start with user requirements, derive functional specifications and maybe some test cases to gauge if you satisfy the user requirements).

    Define major modules (i.e. database interface, other hardware the app touches, com protocols, state machines used, error handling, data logging, are you using a model-view-controller pattern for the UI? etc)

    Define the data model (i.e. what the tables in a relational database are, superkeys, XML trees, whatever you use to store data; also include some kind of versioning system for your data model… it’s going to come in handy later on.)

    More recently this step has changed somewhat with the advance of NoSQL databases. It is still very important to take a first shot at defining a schema for your database.

    Define the “business logic” (start from the inputs and outputs of the system and work inwards; do you have any constraints on the data manipulation; once defined, what constraints does the system impose on the users)

    Design UI (umm… I guess there’s way too much to talk about here)

    Somewhere in between UI and data model there should be a well defined IO Protocol (i.e. this kind of data comes from the user and this comes from the database. Then the user can only see this kind of data in this particular format)

    Resource management (how fast do you want your app to respond to UI input? Maybe you need to use some threads to make it feel smooth – if that’s the case try to define what must stay on the main thread and what can be spun off into secondary threads; if you are on an embedded architecture start thinking about interrupt priority, how big the stack can be, and whether you are running out of RAM or FLASH. Even on a desktop you probably don’t want to load a 1GiB data set and play with it in RAM.)

    Security/networking (is this accessible over the intertubes? Can the data set be compromised by some malefic spirit? More networking related – what happens if the internets are not available – do you have graceful degradation of service or does the app blow up in your face?)

    Error processing and failure recovery. Where do you deal with the errors? Do you have a singleton that you pump errors to and that then becomes the decision nexus for what needs to happen next (I don’t like this as it becomes a highly coupled element in the design – but sometimes it is needed), or do you deal with errors locally (this sometimes still requires a global Error object that stores all kinds of error flags that need to be checked by different modules of the app)? Early on my UI didn’t have consistent error conventions (i.e. a well defined error code structure) – and that made me feel kind of silly. Make sure you know which errors you detect and report without doing anything versus detect and try to correct. Are there any errors you can foresee (i.e. user input that’s contradictory)?

    Scalability (aha… this bit me quite hard…) This is where you want to make sure you decouple modules and have nicely defined interfaces. Keep your data models independent from your data manipulation code. Use polymorphism to deal with different versions. Which modules are most likely to change? Throughout all this remember that decoupling shifts the complexity from local to global.

    I’m kind of done… this is a raw brain dump, all typed on an iPhone, so pardon the spelling and coherence.

    I would probably just come up with a design and then subject it to an analysis that focuses on the areas above. Then implement some of it and repeat until satisfied.

    PHASE LOCK THEORY

    Here I will present the notes I made while developing the phase lock algorithm used to detect trace amounts of DNA orbiting an electrical focal point.

    It took me some time to wrap my mind intimately enough around the phase lock concept to be able to code it simply into an application. I am going to lay down my thoughts here for my future self to reference.

    Phase Locking allows me to home in on a signal from a noisy source by eliminating all other components of the noisy source except for a known frequency at a known phase. I know the frequency and phase because I either generate the signal or whoever is generating it passes this information on to me. I found that what helped the most was the following mathematical presentation.

    Let:

    (1)   \begin{equation*} S_{enc} = A_{enc} \times sin(\omega_{enc} \times t + \theta_{enc}) \end{equation*}

    be an Encoded Signal. S_{enc} is composed of the carrier wave and the Signal of Interest I want to transmit across the medium. In this case the amplitude of S_{enc} varies and is proportional to the Signal of Interest.
    Let:

    (2)   \begin{equation*} S_{bkgN} = \sum_{k=1}^n A_k \times sin(\omega_k \times t + \theta_k) \end{equation*}

    be the Background Noise that accompanies the Encoded Signal. In this case the sum indicates that the profile of the Background Noise spans many frequencies at different amplitudes and phase offsets.

    The signal that I receive and that I am going to apply the Phase Lock procedure on contains both the Encoded Signal and the Background Noise:

    (3)   \begin{equation*} \begin{split} S_T &= S_{enc} + S_{bkgN}\\ & =A_{enc} \times sin(\omega_{enc} \times t + \theta_{enc}) + \\ &\phantom{=}\, +\sum_{k=1}^n A_k \times sin(\omega_k \times t + \theta_k) \end{split} \end{equation*}

    Here is how I imagine the signal of interest, carrier wave and channel noise are combined to result in what gets received by the sensing module:

    [Figure: PhaseLockInputV0.2]

    At this point I am going to delve into a bit of signal processing. My end goal is to extract the amplitude A_{enc} from the noisy S_T. For this I will use my knowledge that the Encoded Signal S_{enc} is broadcast at a frequency \omega_{enc} with a phase offset \theta_{enc}. The procedure is the following:

    1) Multiply S_T by a reference signal S_{ref}. In this case S_{ref} has the same frequency and phase as S_{enc}.
    2) Convolve the result with a constant function (this can also be interpreted as integrating the result).

    To understand what the two steps above do I am going to present their effects both in time and frequency domain.

    Remember: CONVOLUTION in time domain is MULTIPLICATION in frequency domain. And the other way around.

    The outcome of the first step (the S_T \times S_{ref} in the time domain) in the frequency domain is to shift the frequency \omega_{enc} of my Encoded Signal to DC (0 Hz).

    The second step (the convolution of the result of step 1 with a DC function in the time domain) results in all frequencies except DC being multiplied by 0 – practically leaving only the power component of the Encoded Signal.

    If you manage to understand the sequence of steps above the math below will make much more sense.

    Now that I have a good intuition on what I am planning on doing I’ll go through the math in time domain to make sure I get all the fine details right.

    So first up is the multiplication – which is also known as “beating” S_T with a reference signal S_{ref}.

    Let:

    (4)   \begin{equation*} S_{ref}=sin(\omega_{ref} \times t + \theta_{ref}) \end{equation*}

    Then:

    (5)   \begin{equation*} \begin{split} S_T \times S_{ref} &= [A_{enc} \times sin(\omega_{enc} \times t + \theta_{enc})] \times sin(\omega_{ref} \times t + \theta_{ref}) +\\ &\phantom{=}\, +[\sum_{k=1}^n A_k \times sin(\omega_k \times t + \theta_k)] \times sin(\omega_{ref} \times t + \theta_{ref}) \end{split} \end{equation*}

    Remembering from whatever is left of my trigonometry memories that:

    (6)   \begin{equation*} sin(u) \times sin(v) = \frac{cos(u-v) - cos(u+v)}{2} \end{equation*}

    The top term of (5) simplifies to:

    (7)   \begin{equation*} \begin{split} \frac{A_{enc}}{2} \times [cos(\omega_{enc} \times t + \theta_{enc} - \omega_{ref} \times t - \theta_{ref}) - \\ - cos(\omega_{enc} \times t + \theta_{enc} + \omega_{ref} \times t + \theta_{ref})] \end{split} \end{equation*}

    Now remember from the way I chose S_{ref} that \omega_{enc}=\omega_{ref} and that \theta_{enc}=\theta_{ref}, so the first cos() becomes cos(0)=1 which is a DC (non-oscillating) term. So the top term ends up being:

    (8)   \begin{equation*} \frac{A_{enc}}{2} \times [1 - cos(2 \times (\omega_{enc} \times t + \theta_{enc}))] \end{equation*}

    The bottom term of (5) becomes:

    (9)   \begin{equation*} \begin{split} \sum_{k=1}^n \frac{A_k}{2} \times [cos(\omega_k \times t + \theta_k - \omega_{ref} \times t - \theta_{ref}) - \\ - cos(\omega_k \times t + \theta_k + \omega_{ref} \times t + \theta_{ref})] \end{split} \end{equation*}

    In this case however we can’t cancel out \omega_k \times t with \omega_{ref} \times t unless \omega_k is equal to \omega_{ref} and \theta_k = \theta_{ref}. This happens for Background Noise frequencies \omega_k that are “close” to the Encoded Signal \omega_{enc}=\omega_{ref} frequency and at about the same phase offset \theta_k = \theta_{enc} = \theta_{ref}.

    Putting everything together equation (5) can then be simplified to:

    (10)   \begin{equation*} \begin{split} S_T \times S_{ref} &= \frac{A_{enc}}{2} \times [1 - cos(2 \times (\omega_{enc} \times t + \theta_{enc}))] + \\ &\phantom{=}\, + \sum_{k=1}^n \frac{A_k}{2} \times [cos((\omega_k - \omega_{ref}) \times t + \theta_k - \theta_{ref}) - \\ &\phantom{=}\, - cos((\omega_k + \omega_{ref}) \times t + \theta_k + \theta_{ref})] \end{split} \end{equation*}

    Upon a slightly more careful inspection I observe that the result of beating S_T with S_{ref} is a DC term and a series of oscillatory terms at different frequencies.

    NOTE: Actually there’s two DC terms. The 2nd DC term is hidden in the cos[(\omega_k - \omega_{ref}) \times t + \theta_k -\theta_{ref}] term. This happens as mentioned before when \omega_k is about the same frequency as \omega_{enc}=\omega_{ref} and when \theta_k is equal to \theta_{enc} = \theta_{ref}.

    So if I were to filter the result of S_T \times S_{ref} with a DC filter I would remove all the oscillating terms. To filter using a DC term I convolve the multiplication by a constant term which leaves me with:

    (11)   \begin{equation*} [ S_T \times S_{ref} ] * 1 = \frac{A_{enc}}{2} \times 1 + \frac{A_{k0}}{2} \times 1 \end{equation*}

    where the _{k0} term is used to indicate the frequency \omega_{k0} for which \omega_{k0} = \omega_{enc} and \theta_{k0} = \theta_{enc}. That is the component of the Background Noise that has a frequency and phase equal (or very close to) the Encoded Signal’s frequency and phase.

    If we assume the Background Noise at the S_{ref} frequency and phase offset is small, then:

    (12)   \begin{equation*} [S_T \times S_{ref} ] * 1 = \frac{A_{enc}}{2} \end{equation*}

    Which means that:

    (13)   \begin{equation*} A_{enc} = 2 \times [S_T \times S_{ref}] * 1 \end{equation*}

    which brings us to the goal of finding A_{enc}.
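
    As a quick numerical sanity check of the procedure above (my own sketch, with made-up amplitudes and noise frequencies):

    import numpy as np

    fs = 10000                            # samples per second
    t = np.arange(0, 1.0, 1.0 / fs)       # one second of samples
    A_enc, w_enc, th_enc = 0.05, 2 * np.pi * 100, 0.3   # weak encoded signal at 100 Hz

    s_enc = A_enc * np.sin(w_enc * t + th_enc)
    s_bkgN = sum(A * np.sin(2 * np.pi * f * t + th)     # background noise at other frequencies
                 for A, f, th in [(1.5, 20, 0.1), (0.8, 55, 1.2),
                                  (2.0, 180, 2.1), (1.1, 333, 0.7)])
    s_total = s_enc + s_bkgN

    s_ref = np.sin(w_enc * t + th_enc)    # reference: same frequency and phase as S_enc
    beaten = s_total * s_ref              # step 1: multiply (beat) with the reference
    A_recovered = 2 * beaten.mean()       # step 2: average (the DC filter), then undo the 1/2

    print(A_enc, A_recovered)             # A_recovered comes out ~0.05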

    Thanks to the people at tex.stackexchange for teaching me how to use LaTeX properly.

    The biggest thanks go to The Scientist and Engineer’s Guide to Digital Signal Processing by Steven W. Smith, Ph.D for teaching me the intro knowledge on convolution.