Skip to main content


Beiträge die mit math getaggt sind

`j'ai revé que je comprenais pas que
si pour tout x inclus dans [a,b] il existe f'(x) , alors f est continu sur [a,b]

#math #fonction #derive


#Math | The Golden Ratio (why it is so irrational) - Numberphile •


[ Free Online Calculators - Math, Health, Financial, Science](

#math #top
#math #top

[ Free Online Calculators - Math, Health, Financial, Science](

#math #top
#math #top


Reciprocals of primes

Here’s an interesting little tidbit:
For any prime p except 2 and 5, the decimal expansion of 1/p
repeats with a period that divides p-1.
The period could be as large as p-1, but no larger. If it’s less than
p-1, then it’s a divisor of p-1.

Here are a few examples.

1/3 = 0.[3]{style="color: #008cff; text-decoration: underline;"}3…\
1/7 =
0.[142857]{style="color: #008cff; text-decoration: underline;"}142857…\
1/11= 0.[09]{style="color: #008cff; text-decoration: underline;"}09…\
1/13 =
0.[076923]{style="color: #008cff; text-decoration: underline;"}076923…\
1/17 =
0.[0588235294117647]{style="color: #008cff; text-decoration: underline;"}0588235294117647…\
1/19 =
0.[052631578947368421]{style="color: #008cff; text-decoration: underline;"}052631578947368421…\
1/23 =
0.[0434782608695652173913]{style="color: #008cff; text-decoration: underline;"}0434782608695652173913…

Here’s a table summarizing the periods above.
| prime | period |
| 3 | 1 |
| 7 | 6 |
| 11 | 2 |
| 13 | 6 |
| 17 | 16 |
| 19 | 18 |
| 23 | 22 |

For a proof of the claims above, and more general results, see Periods
#johndcook #Math #Numbertheory
Reciprocals of primes


Spectral sparsification

The latest episode of My Favorite

features John Urschel, former offensive lineman for the Baltimore Ravens
and current math graduate student. His favorite theorem is a result on
graph approximation: for every weighted graph, no matter how densely
connected, it is possible to find a sparse graph whose Laplacian
approximates that of the original graph. For details see Spectral
Sparsification of Graphs
by Dan
Spielman and Shang-Hua Teng.

You could view this theorem as a positive result—it’s possible to find
good sparse approximations—or a negative result—the Laplacian doesn’t
capture very well how densely a graph is connected.

Related: Blog posts based on my interview with Dan Spielman at the
2014 Heidelberg Laureate Forum.Bild/Foto{width="1"
#johndcook #Math #Networks
Spectral sparsification

John D. Cook: Spectral sparsification

Всем привет, я #новичок.Мне интересны #math и #programming-cpp.


Two-sample t-test and robustness

A two-sample t-test is intended to determine whether there’s
evidence that two samples have come from distributions with different
means. The test assumes that both samples come from normal

Robust to non-normality, not to asymmetry

It is fairly well known that the t-test is robust to departures from a
normal distribution, as long as the actual distribution is symmetric.
That is, the test works more or less as advertised as long as the
distribution is symmetric like a normal distribution, but it may not
work as expected if the distribution is asymmetric.

This post will explore the robustness of the t-test via simulation.
How far can you be from a normal distribution and still do OK? Can you
have any distribution as long as it’s symmetric? Does a little
asymmetry ruin everything? If something does go wrong, how does it go

Experiment design

For the purposes of this post, we’ll compare the null hypothesis that
two groups both have mean 100 to the alternative hypothesis that one
group has mean 100 and the other has mean 110. We’ll assume both
distributions have a standard deviation of 15. There is a variation on
the two-sample t-test that does not assume equal variance, but for
simplicity we will keep the variance the same in both groups.

We’ll do a typical design, 0.05 significance and 0.80 power. That is,
under the null hypothesis, that is if the two groups do come from the
same distribution, we expect the test to wrongly conclude that the two
distributions are different 5% of the time. And under the alternative
hypothesis, when the groups do come from distributions with means 100
and 110, we expect the test to conclude that the two means are indeed
different 80% of the time.

Under these assumptions, you’d need a sample of 36 in each group.

Python simulation code

Here’s the code we’ll use for our simulation.
from scipy.stats import norm, t, gamma, uniform, ttest_ind

num_per_group = 36

def simulate_trials(num_trials, group1, group2):
num_reject = 0
for _ in range(num_trials):
a = group1.rvs(num_per_group)
b = group2.rvs(num_per_group)
stat, pvalue = ttest_ind(a, b)
if pvalue < 0.05:
num_reject += 1

Normal simulation

.size-medium width="613" height="460"}

Let’s see how the two-sample t-test works under ideal conditions by
simulating from the normal distributions that the method assumes.

First we simulate from the null, i.e. we draw the data for both groups
from the same distribution
n1 = norm(100, 15)
n2 = norm(100, 15)
print( simulate_trials(1000, n1, n2) )

Out of 1000 experiment simulations, the method rejected the null 54
times, very close to the expected number of 50 based on 0.05

Next we simulate under the alternative, i.e. we draw data from different
n1 = norm(100, 15)
n2 = norm(110, 15)
print( simulate_trials(1000, n1, n2) )

This time the method rejected the null 804 times in 1000 simulations,
very close to the expected 80% based on the power.

Gamma simulation

.size-medium width="613" height="460"}

Next we draw data from a gamma distribution. First we draw both groups
from a gamma distribution with mean 100 and standard deviation 15. This
requires a shape parameter of 44.44 and a scale of 2.25.
g1 = gamma(a = 44.44, scale=2.25)
g2 = gamma(a = 44.44, scale=2.25)
print( simulate_trials(1000, g1, g2) )

Under the null simulation, we rejected the null 64 times out of 1000
simulations, higher than expected, but not that remarkable.

Under the alternative simulation, we pick the second gamma distribution
to have mean 110 and standard deviation 15, which requires shape 53.77
and scale 2.025.
g1 = gamma(a = 44.44, scale=2.25)
g2 = gamma(a = 53.77, scale=2.045)
print( simulate_trials(1000, g1, g2) )

We rejected the null 805 times in 1000 simulations, in line with what
you’d expect from the power.

.size-medium width="613" height="460"}

When the gamma distribution has such large shape parameters, the
distributions are approximately normal, though slightly asymmetric. To
increase the asymmetry, we’ll use a couple gamma distributions with
smaller mean but shifted over to the right. Under the null, we create
two gamma distributions with mean 10 and standard deviation 15, then
shift them to the right by 90.
g1 = gamma(a = 6.67, scale=1.5, loc=90)
g2 = gamma(a = 6.67, scale=1.5, loc=90)
print( simulate_trials(1000, g1, g2) )

Under this simulation we reject the null 56 times out of 1000
simulations, in line with what you’d expect.

For the alternative simulation, we pick the second distribution to have
a mean of 20 and standard deviation 15, and then shift it to the right
by 90 so tht it has mean 110. This distribution has quite a long tail.
g1 = gamma(a = 6.67, scale=1.5, loc=90)
g2 = gamma(a = 1.28, scale=11.25, loc=90)
print( simulate_trials(1000, g1, g2) )

This time we rejected the null 499 times in 1000 simulations. This is a
serious departure from what’s expected. Under the alternative
hypothesis, we reject the null 50% of the time rather than the 80% we’d
expect. If higher mean is better, this means that half the time you’d
fail to conclude that the better group really is better.

Uniform distribution simulation

Next we use uniform random samples from the interval [74, 126]. This
is a symmetric distribution with mean 100 and standard deviation 15.
When we draw both groups from this distribution we rejected the null 33
times, quite a bit less than the 50 times we’d expect.
u1 = uniform(74, 126)
u2 = uniform(74, 126)
print( simulate_trials(1000, u1, u2) )

If we move the second distribution over by 10, drawing from [84, 136],
we rejected the null 365 times out of 1000. That is, the t-test made
the correct decision only about 1/3 of the time rather than the 80% we’d
expect from a normal distribution.
u1 = uniform(74, 126)
u2 = uniform(84, 136)
print( simulate_trials(1000, u1, u2) )

This shows that even with symmetry, you can’t completely violate the
normal distribution assumption.

Student-t simulation

Finally we simulate from a Student-t distribution. This is a symmetric
distribution, but heavier in the tails than the normal distribution.

Bild/Foto{.aligncenter .size-medium
width="613" height="460"}

First, we simulate from a t distribution with 6 degrees of freedom and
scale 12.25, making the standard deviation 15. We shift the location of
the distribution by 100 to make the mean 100.
t1 = t(df=6, loc=100, scale=12.25)
t2 = t(df=6, loc=100, scale=12.25)
print( simulate_trials(1000, t1, t2) )

When both groups come from this distribution, we rejected the null 46
times. When we shifted the second distribution to have mean 110, we
rejected the null 777 times out of 1000.
t1 = t(df=6, loc=100, scale=12.25)
t2 = t(df=6, loc=110, scale=12.25)
print( simulate_trials(1000, t1, t2) )

In both cases, the results are in line what we’d expect given a 5%
significance level and 80% power.

Bild/Foto{.aligncenter .size-medium
width="613" height="460"}

A t distribution with 6 degrees of freedom has a heavy tail compared
to a normal. Let’s try making the tail even heavier by using 3 degrees
of freedom. We make the scale 15 to keep the standard deviation at 15.
t1 = t(df=3, loc=100, scale=15)
t2 = t(df=3, loc=100, scale=15)
print( simulate_trials(1000, t1, t2) )

When we draw both samples from the same distribution, we rejected the
null 37 times out of 1000, less than the 50 times we’d expect.
t1 = t(df=3, loc=100, scale=15)
t2 = t(df=3, loc=110, scale=15)
print( simulate_trials(1000, t1, t2) )

When we shift the second distribution over to have mean 110, we rejected
the null 463 times out of 1000, far less than the 80% rejection you’d
expect if the data were normally distributed.

These two examples show that you can replace the normal distribution
with a moderately heavy tailed symmetric distribution, but don’t overdo
it. When the data some from a heavy tailed distribution, even one that
is symmetric, the two-sample t-test may not have the operating
characteristics you’d expect.

#johndcook #Math #Statistics #ProbabilityandStatistics
Two-sample t-test and robustness

John D. Cook: Robustness of the two-sample t-test

Hey everyone, I’m #newhere. I’m interested in #dance, #esperanto, #gaming, #linguistics, #math, #music, and #singing.

I run a #nonprofit that specializes in sex education. I'm currently in #school getting a degree in #mathematics. I will be pursuing a career in #datascience after graduation.

I keep on bouncing between these networks, such as friendica, hubzilla, and social home to see which works best for me and, so far, #diaspora itself seems to be my favorite.


Ask HN: How do I learn math/physics in my thirties

I'm in my early thirties and I feel I've not really made any significant effort in learning math/physics beyond the usual curriculum at school. I realize I didn't have the need for it and didn't have the right exposure (environment/friends) that would have inculcated in me these things. And perhaps I was lazy as well all these years to go that extra mile.

I have (had) a fairly good grasp of calculus and trigonometry and did a fairly good job working on a number of problems in high school. But over the past 12-13 years, I've really not had any need to flex my math muscles other than a problem here or there at work. Otherwise it's the same old enterprise software development.

I follow a bunch of folks on the internet and idolize them for their multifaceted personalities - be it math, programming/problem solving, physics, music etc. And these people had a natural flair for math/physics which was nurtured by their environment which made them participate in IOI/ACPC etc. in high school and undergrad which unfortunately I didn't get a taste of. I can totally see that these are the folks who have high IQs and they can easily learn a new domain in a few months if they were put in one.

Instead of ruing missed opportunities, I want to take it under my stride in my thirties to learn math/physics so as to become better at it. I might not have made an effort till now, but I hopefully have another 40 years to flex my muscles. I believe I'm a little wiser than how I was a few years back, so I'm turning to the community for help.

How do I get started? I'm looking to (re)learn the following - calculus, linear algebra, constraint solving, optimization problems, graph theory, discrete math and slowly gain knowledge and expertise to appreciate theoretical physics, astrophysics, string theory etc.

HN link:
Posted by mosconaut (karma: 57)
Post stats: Points: 148 - Comments: 59 - 2018-05-15T16:40:09Z

\#HackerNews #ask #how #learn #math #physics #thirties
HackerNewsBot debug: Calculated post rank: 118 - Loop: 110 - Rank min: 100 - Author rank: 570

Here is my idea for an #esolang, since that #tag is still pretty sparsely used :)
  • The program input and output are both real numbers from 0 to 2 #pi #radians.
  • The input specifies an angle of a #ray of #light that is cast in a 2D plane from the point <0,0>.
  • The #program specifies a set of mirrors in the 2D plane. Each #mirror is specified as a parametric #curve, i.e. the functions x(t) and y(t), where t goes from 0 to 1. (The functions could be #polynomials with maybe also #trigonometric functions allowed?)
  • The output is the angle of the ray after it has made the last bounce, i.e. the angle at #infinity. (If it bounces forever, the program is in an #infiniteloop.)
E.g. an empty program (no mirrors) is an identity #function, since the input angle is also the output angle. A zero function would be a parabolic mirror with the focus at <0,0>. And so on.

The questions are:
  • Is it #turing complete? What would the (idea of the) #proof be?
  • Could an accurate, i.e. non-float #implementation be made? The input would be naturally always finite, but couldn't #irrational numbers arise because of the trig functions or something? Not sure here.
  • Isn't there already such #language?
  • In this lang, how would you program a function y = 2 * pi - x? Feel free to draw :)
  • Would extensions, such as refractions, significantly change the language properties?
  • Would an infinite number of mirrors significantly change the language properties?
  • Any other interesting questions? :)
#math #compsci #programming


Making a career out of the chain rule

When I was a teenager, my uncle gave me a calculus book and told me that
mastering calculus was the most important thing I could do for starting
out in math. So I learned the basics of calculus from that book. Later I
read Michael Spivak’s two calculus books. I took courses that built on
calculus, and studied generalizations of calculus such as calculus with
complex variables, calculus in Banach spaces, etc. I taught calculus.
After a while, I started to feel maybe I’d mastered calculus.

Last year I started digging into automatic differentiation and
questioned whether I really had mastered calculus. At a high level,
automatic differentiation is “just” an application of the chain rule in
many variables. But you can make career out of exploring automatic
differentiation (AD), and many people do. The basics of AD are not that
complicated, but you can go down a deep rabbit hole exploring optimal
ways to implement AD in various contexts.

You can make a career out of things that seem even simpler than AD.
Thousands of people have made a career out of solving the equation Ax
= b where A is a large matrix and the vector b is given. In high
school you solve two equations in two unknowns, then three equations in
three unknowns, and in principle you could keep going. But you don’t
solve a sparse system of millions of equations the same way. When you
consider efficiency, accuracy, limitations of computer arithmetic,
parallel computing, etc. the humble equation Ax = b is not so simple
to solve.

As Richard Feynman said, nearly everything is really interesting if you
go into it deeply enough.

Related posts:Bild/Foto{width="1"
#johndcook #Computing #Math
Making a career out of the chain rule

John D. Cook: Applying the chain rule: automatic differentiation

#Math | The Problem with 7825 -

Hello everyone! I'm #newhere. I'm interested in #cdnpoli #onpoli #punk #music #technology #security #math #software #freesoftware #ethics and #digitalrights among other things. I recently moved here from #facebook and I'm looking forward to making this my new online home for medium-long form content sharing.

I teed off a water line in my basement and connected it to the ice maker solenoid. Since the water contains chloramines, I'll need to run a dosing pump simultaneously with the solenoid valve to add dechlorinator, so I need to calculate the proper concentration. Can someone please check my math? (apologies for the mixed metric/imperal units)

I timed how long it took to fill a 1 gallon jug from the solenoid: 200 seconds. 3600 seconds per hour, divided by 200 = 18 gallons per hour.

I need 1ml of dechlorinator for every 6 gallons of water, so for 18 gallons I need 3ml.

The dosing pump is rated at 3.5 gph, so to dechlorinate 18 gallons of water with 3.5 gallons of dechlorinator, the solution will need to be 3ml per 3.5 gallons, or 0.85ml per gallon.

Did I do that right?

#aquariums #water #ato #solenoid #water #flowrates #chemistry #math #maths #automation #plumbing #diy

I teed off a water line in my basement and connected it to the ice maker solenoid. Since the water contains chloramines, I'll need to run a dosing pump simultaneously with the solenoid valve to add dechlorinator, so I need to calculate the proper concentration. Can someone please check my math? (apologies for the mixed metric/imperal units)

I timed how long it took to fill a 1 gallon jug from the solenoid: 200 seconds. 3600 seconds per hour, divided by 200 = 18 gallons per hour.

I need 1ml of dechlorinator for every 6 gallons of water, so for 18 gallons I need 3ml.

The dosing pump is rated at 3.5 gph, so to dechlorinate 18 gallons of water with 3.5 gallons of dechlorinator, the solution will need to be 3ml per 3.5 gallons, or 0.85ml per gallon.

Did I do that right?

#aquariums #water #ato #solenoid #water #flowrates #chemistry #math #maths #automation #plumbing #diy

| #math #lessons

| #math #lessons


Lessons from My Math Degree That Have Nothing to Do with Math

HN link:
Posted by ginnungagap (karma: 91)
Post stats: Points: 121 - Comments: 59 - 2018-05-19T13:01:47Z

\#HackerNews #degree #from #have #lessons #math #nothing #that #with
HackerNewsBot debug: Calculated post rank: 100 - Loop: 212 - Rank min: 100 - Author rank: 50


Line art

A new video from 3Blue1Brown is about
visualizing derivatives as stretching and shrinking factors. Along the
way they consider the function f(x) = 1 + 1/x.

Iterations of f converge on the golden ratio, no matter where you
start (with one exception). The video creates a graph where they connect
values of x on one line to values of f(x) on another line.
Curiously, there’s an oval that emerges where no lines cross.

Here’s a little Python I wrote to play with this:
import matplotlib.pyplot as plt
from numpy import linspace

N = 70
x = linspace(-3, 3, N)
y = 1 + 1/x

for i in range(N):
plt.plot([x, y[i]], [0, 1])
plt.xlim(-5, 5)

And here’s the output:

Bild/Foto{.aligncenter .size-medium
width="613" height="460"}

In the plot above I just used matplotlib’s default color sequence for
each line. In the plot below, I used fewer lines (N = 50) and specified
the color for each line. Also, I made a couple changes to the plot
command, specifying the color for each line and putting the [i]x line
above the y line.

Bild/Foto{.aligncenter .size-medium
width="613" height="460"}

If you play around with the Python code, you probably want to keep N
even. This prevents the x array from containing zero.

#johndcook #Uncategorized #Math
Line art

John D. Cook: Negative space in golden ratio iteration

is one of the most #beautiful algorithms I learned at Uni. It searches for parametric shapes (such as lines or circles) in images. The shapes can be partially covered or degraded by noise, but it will still find them. The beauty lies in using only a very simple #math and one brilliant idea - using a parameter space. IIRC it works like this (let's suppose we are looking for lines):
  • Initialize an empty parameter space. For line this will be a 2D plane with two axes: offset and angle (two parameters that define a line).
  • Detect edge pixels in the source image - any simple algorithm such as Sobel will work.
  • For each of the detected edge pixels draw into the parameter space a set (in our case a line) containing all possible parameters such that they define a shape that goes through the edge pixel. I.e. each edge pixel in the picture will create a line in the parameter space.
  • Intersections (of the lines) in the parameter space represent parameters of the shapes (lines) present in the picture. E.g. if we find two intersection <p1,p2> and <p3,p4> in the parameter space, we have found two lines whose parameters (offset and angle) are p1, p2 and p3, p4.
Do you have any favorite, maybe lesser known, #algorithm? I'd like to hear :)

#compsci #programming #computervision #computergraphics


Fixed points of logistic function

Here’s an interesting problem that came out of a logistic regression
application. The input variable was between 0 and 1, and someone asked
when and where the logistic transformation

f(x) = 1/(1 + exp(a + bx))

has a fixed point, i.e. f(x) = x.

So given logistic regression parameters a and b, when does the
logistic curve given by y = f(x) cross the line y = x? Do they
always cross? Can they cross twice?

There’s always at least one solution. Because f(x) is strictly
between 0 and 1, the function

g(x) = f(x) – x

is positive at 0 and negative at 1, and by the intermediate value
theorem g(x) must be zero for some x between 0 and 1.

Sometimes f has only one fixed point. It may have two or three fixed
points, as demonstrated in the graph below. The case of two fixed points
is unstable: the logistic curve is tangent to the line y = x at one
point, and a tiny change would turn this tangent point into either no
crossing or two crossings.

.size-medium width="613" height="460"}

If |b| < 1, then you can show that the function f is a
contraction map on [0, 1]. In that case there is a unique solution
to f(x) = x, and you can find it by starting with an arbitrary
value for x and repeatedly applying f to it. For example, if a= 1
and b = 0.8 and we start with x= 0, after applying f ten times we
get x = f(x) = 0.233790157.

There are a couple questions left to finish this up. How can you tell
from a and b how many fixed points there will be? The condition
|b| < 1 is sufficient for f to be a contraction map on [0, 1].
Can you find necessary and sufficient conditions?

#johndcook #Math
Fixed points of logistic function

John D. Cook: Fixed points of logistic function


10 best rational approximations for pi

It’s easy to create rational approximations for π. Every time you write
down π to a few decimal places, that’s a rational approximation. For
example, 3.14 = 314/100. But that’s not the best approximation.

Think of the denominator of your fraction as something you have to buy.
If you have enough budget to buy a three-digit denominator, then you’re
better off buying 355/113 rather than 314/100 because the former is more
accurate. The approximation 355/113 is accurate to 6 decimal places,
whereas 314/100 is only accurate to 2 decimal places.

There’s a way to find the most economical rational approximations to π,
or any other irrational number, using continued fractions. If you’re
interested in details, see the links below.

Here are the 10 best rational approximations to π, found via continued
| Fraction | Decimals |
| 3 | 0.8 |
| 22/7 | 2.9 |
| 333/106 | 4.1 |
| 355/113 | 6.6 |
| 103993/33102 | 9.2 |
| 104348/33215 | 9.5 |
| 208341/66317 | 9.9 |
| 312689/99532 | 10.5 |
| 833719/265381 | 11.1 |
| 1146408/364913 | 11.8 |

If you only want to know the number of correct decimal places, ignore
the fractional parts of the numbers in the Decimals column above. For
example, 22/7 gives two correct decimal places. But it almost gives
three. (Technically, the Decimals column gives the -log~10~ of the
absolute error.)

In case you’re curious, here’s a plot of the absolute errors on a log

.size-medium width="400"}

Related posts

#johndcook #Math
10 best rational approximations for pi

John D. Cook: Best rational approximations for pi


Combinatorics, just beyond the basics

Most basic combinatorial problems can be solved in terms of
multiplication, permutations, and combinations. The next step beyond the
basics, in my experience, is counting selections with
Often when I run into a problem that is not quite transparent, it boils
down to this.

Examples of selection with replacement

Here are three problems that reduce to counting selections with

Looking ahead in an experiment

For example, suppose you’re running an experiment in which you randomize
to n different treatments and you want to know how many ways the
next k subjects can be assigned to treatments. So if you had
treatments A, B, and C, and five subjects, you could assign all
five to A, or four to A and one to B, etc. for a total of 21
possibilities. Your choosing 5 things from a set of 3, with replacement
because you can (and will) assign the same treatment more than once.

Partial derivatives

For another example, if you’re taking the kth order partial
derivatives of a function of n variables, you’re choosing k things
(variables to differentiate with respect to) from a set of n (the
variables). Equality of mixed partials for smooth functions says that
all that matters is how many times you differentiate with respect to
each variable. Order doesn’t matter: differentiating with respect to x
and then with respect to y gives the same result as taking the
derivatives in the opposite order, as long as the function you’re
differentiating has enough derivatives. I wrote about this example

Sums over even terms

I recently had an expression come up that required summing over n
distinct variables, each with a non-negative even value, and summing to
2k. How many ways can that be done? As many as dividing all the
variable values in half and asking that they sum to k. Here the thing
being chosen the variable, and since the indexes sum to k, I have a
total of k choices to make, with replacement since I can chose a
variable more than once. So again I’m choosing k things with
replacement from a set of size n.


I wrote up a set of notes on sampling with

that I won’t duplicate here, but in a nutshell:


The symbol on the left is Richard Stanley’s suggested notation for the
number of ways to select k things with replacement from a set of
size n. It turns out that this equals the expression on the right
side. The derivation isn’t too long, but it’s not completely obvious
either. See the aforementioned notes for details.

I started by saying basic combinatorial problems can be reduced to
multiplication, permutations, and combinations. Sampling with
replacement can be reduced to a combination, the right side of the
equation above, but with non-obvious arguments. Hence the need to
introduce a new symbol, the one on the right, that maps directly to
problem statements.

Enumerating possibilities

Sometimes you just need to count how many ways one can select k things
with replacement from a set of size n. But often you need to actually
enumerate the possibilities, say to loop over them in software.

In conducting a clinical trial, you may want to enumerate all the ways
pending data could turn out. If you’re going to act the same way, no
matter how the pending data work out, there’s no need to wait for the
missing data before proceeding. This is something I did many times when
I worked at MD Anderson Cancer Center.

When you evaluate a multivariate Taylor series, you need to carry out a
sum that has a term for corresponding to each partial derivative. You
could naively sum over all possible derivatives, not taking into account
equality of mixed partials, but that could be very inefficient, as
described here.

The example above summing over even partitions of an even number also
comes out of a problem with multivariate Taylor series, one in which odd
terms drop out by symmetry.

To find algorithms for enumerating selections with replacement, see
Knuth’s TAOCP, volume 4, Fascicle 3.

Related posts

#johndcook #Math
Combinatorics, just beyond the basics

John D. Cook: Combinatorics beyond the basics

How I think we could simulate #Earth #gravity on #Mars:


It's a rotating tilted plane that combines the Mars' default gravity with centrifugal force into a final force that is equal to Earth's gravity force. The problem is that the final gravity slightly differs in strength and angle over the plane due to varying distance from rotation center which determines the centrifugal force, but I guess people could handle that. This can also be reduced by making the rotation radius bigger.

The final solution would be a rotating ring, something like what you see in #2001 Space Oddysey #movie. It would contain rooms for people to live and spend their free time in. They could go outside using a lift that would take them to the center of the rotating ring where the lift would then start counter-rotating against the ring to sync the person with the outside world and allow them to get out, or back in.

Walls on one side would probably have to be stuffed to prevent injury in case something breaks and the ring stops rotating - people would then fall onto walls.

There could also be windows (not #Windows!) to allow a view outside, but you'd be looking either at the sky or into the ground - this could easily be fixed with mirrors but still, the view would be rotating and your head might start spinning. It might also look like you're in a train though, so not sure if it would actually be good or not.

So... did I #fail at the #math or should I call #Elon and get myself hired at #spacex? :D

#physics #space

[Oficina de Matemática | Um blog para armazenar as actividades e os resultados… bons ou menos bons!](



A Classical Math Problem Gets Pulled Into Self-Driving Cars

#cars #classical #driving #gets #into #math #problem #pulled #science #self


Computing smooth max without overflow

Erik Erlandson sent me a note saying he found my post on computing the

helpful. (If you’re unfamiliar with the soft maximum, here’s a brief
what it is and how you might use it.) Erik writes
I used your post on practical techniques for computing smooth max,
which will probably be the difference between being able to actually
use my result and not. I wrote up what I’m planning to do
His post extends the idea in my post to computing the gradient and
Hessian of the soft max.

#johndcook #Math
Computing smooth max without overflow

John D. Cook: Computing smooth max without overflow


Calendars and continued fractions

Calendars are based on three frequencies: the rotation of the Earth on
its axis, the rotation of the moon around the Earth, and the rotation of
the Earth around the sun. Calendars are complicated because none of
these periods is a simple multiple of the other. The ratios are
certainly not integers, but they’re not even fractions with a small

As I wrote about the other day in the context of rational
approximations for
the most economical rational approximations of an irrational number come
from convergents of continued fractions. The book Calendrical
applies continued fractions to
various kinds of calendars.


Ratio of years to days

The continued fraction for the number of days in a year is as follows.

.size-medium width="379" height="173"}

This means that the best approximations for the length of a year are 365
days plus a fraction of 1/4, or 7/29, or 8/33, or 23/95, etc.

You could have one leap day every four years. More accurate would be 7
leap days every 29 years, etc. The Gregorian calendar has 97 leap days
every 400 years.

Ratio of years to months

If we express the ratio of the length of the year to the length of the
month, we get

.size-medium width="374" height="200"}

By taking various convergents we get 37/3, 99/8, 136/11, etc. So if you
want to design successively more accurate lunisolar calendars, you’d
have 37 lunar months every 3 years, 99 lunar months every 8 years, etc.

Related posts

#johndcook #Math #Continuedfractions
Calendars and continued fractions

John D. Cook: Calendars and continued fractions


Central limit theorem and Runge phenomena

I was playing around with something this afternoon and stumbled on
something like Gibbs phenomena or Runge phenomena for the Central Limit

The first place most people encounter Gibbs phenomena is in Fourier
series for a step function. The Fourier series develops “bat ears” near
the discontinuity. Here’s an example I blogged about before not with
Fourier series but with analogous Chebyshev series.


The series converges rapidly in the middle of the flat parts, but
under-shoots and over-shoots near the jumps in the step function.

Runge phenomena is similar, where interpolating functions under- and
over-shoot the function they’re approximating.


Both plots above come from this

Here’s the example I ran across with the central limit theorem. The
distribution of the average of a set of exponential random variables
converges to the distribution of a normal random variable. The nice
thing about the exponential distribution is that the averages have a
familiar distribution: a gamma distribution. If each exponential has
mean 1, the average has a gamma distribution with shape N and scale
1/N. The central limit theorem says this is converging in distribution
to a normal distribution with the same mean and variance.

The plot below shows the difference between the density function of the
average of N exponential random variables and the density function for
its normal approximation, for N = 10 and for N = 400.

.size-medium width="613" height="460"}

Notice that the orange line, corresponding to N = 400, is very flat,
most of the time. That is, the normal approximation fits very well. But
there’s this spike in the middle, something reminiscent of Gibbs
phenomena or Runge phenomena. Going from 10 to 400 samples the average
error decreases quite a bit, but the maximum error doesn’t go down much
at all.

If you go back and look at the Central Limit Theorem, or its more
quantitative counterpart the Berry-Esseen theorem, you’ll notice that it
applies to the distribution function, not the density, or in other
words, the CDF, not the PDF. I think the density functions do converge
in this case because the exponential function has a smooth density, but
the rate of convergence depends on the norm. It looks like the
convergence is fast in square (L²) norm, but slow in sup norm. A
little experiment shows that this is indeed the case.

.size-medium width="613" height="460"}

Maybe the max norm doesn’t converge at all, i.e. the densities don’t
converge pointwise. It looks like the max norm may be headed toward a
horizontal asymptote, just like Gibbs phenomena.

Related posts

#johndcook #Math #ProbabilityandStatistics
Central limit theorem and Runge phenomena

John D. Cook: Central limit theorem and Runge or Gibbs phenomena


The Scully Effect

The effect of Dana Scully (from X-Files) as role model for women

Research by 21st Century Fox, Geena Davis Institute on Gender in Media, and J. Walter Thompson Intelligence. This is a piece of observational research rather than a controlled experiment, but interesting nonetheless.

* Among women who are familiar with Scully’s character, 63% say Scully increased their confidence that they could excel in a male-dominated profession. The same percentage said that Dana Scully served as their role model.
* Medium/heavy viewers of X-Files are more likely (24%) to have worked in a STEM field than non/light viewers (16%).
* Medium/heavy viewers of X-Files are more likely (53%) to strongly agree with the statement "I would encourage my daughter/granddaughter to enter a STEM field" than non/light viewers (41%).

Scully's character was a sharp departure from the gendered stereotype of a scientist and paved the way for other characters like her on television, such as Veronica Mars, Dr. Temperance "Bones" Brennan (Bones), Sydney Bristow (Alias), and Zoe Washburne (Firefly).

#x-files #xfiles #scully #women #gender #equality #role-model #research #stem #science #technology #engineering #mathematics #maths #math


Relative error in the central limit theorem

If you average a large number independent versions of the same random
variable, the central limit theorem says the average will be
approximately normal. That is the absolute error in approximating the
density of the average by the density of a normal random variable will
be small. (Terms and conditions apply. See notes

But the central limit theorem says nothing about relative error.
Relative error can diverge to infinity while absolute error converges to
zero. We’ll illustrate this with an example.

The average of N independent exponential(1) random variables has a
gamma distribution with shape N and scale 1/N.

As N increases, the average becomes more like a normal in
distribution. That is, the absolute error in approximating the
distribution function of gamma random variable with that of a normal
random variable decreases. (Note that we’re talking about distribution
functions (CDFs) and not densities (PDFs). The previous
discussed a
surprise with density functions in this example.)

The following plot shows that the difference between the distributions
functions get smaller as N increases.

.size-medium width="613" height="460"}

But when we look at the ratio of the tail probabilities, that is
Pr(X > t) / Pr(Y > t) where X is the average of N
exponential r.v.s and Y is the corresponding normal approximation from
the central limit theorem, we see that the ratios diverge, and they
diverge faster as N increases.

.size-medium width="613" height="460"}

To make it clear what’s being plotted, here is the Python code used to
draw the graphs above.
import matplotlib.pyplot as plt
from scipy.stats import gamma, norm
from scipy import linspace, sqrt

def tail_ratio(ns):

x = linspace(0, 4, 400)

for n in ns:
gcdf = gamma.sf(x, n, scale = 1/n)
ncdf = norm.sf(x, loc=1, scale=sqrt(1/n))
plt.plot(x, gcdf/ncdf

plt.legend(["n = {}".format(n) for n in ns])

def cdf_error(ns):

x = linspace(0, 6, 400)

for n in ns:
gtail = gamma.cdf(x, n, scale = 1/n)
ntail = norm.cdf(x, loc=1, scale=sqrt(1/n))
plt.plot(x, gtail-ntail)

plt.legend(["n = {}".format(n) for n in ns])

ns = [1, 4, 16]

#johndcook #Math #ProbabilityandStatistics
Relative error in the central limit theorem

John D. Cook: Relative error in the central limit theorem


How the Math Men Overthrew the Mad Men

Advertising has always been about the search for perfect targeting data, paving the way for the annihilating power of Google and Facebook.
Article word count: 2583

HN Discussion:
Posted by sonabinu (karma: 6061)
Post stats: Points: 132 - Comments: 39 - 2018-06-04T17:53:21Z

\#HackerNews #how #mad #math #men #overthrew #the
Article content:

Once, Mad Men ruled advertising. They’ve now been eclipsed by Math Men—the engineers and data scientists whose province is machines, algorithms, pureed data, and artificial intelligence. Yet Math Men are beleaguered, as Mark Zuckerberg demonstrated when he [1]humbled himself before Congress, in April. Math Men’s adoration of data—coupled with their truculence and an arrogant conviction that their “science” is nearly flawless—has aroused government anger, much as Microsoft did two decades ago.

The power of Math Men is awesome. Google and Facebook each has a market value exceeding the combined value of the six largest advertising and marketing holding companies. Together, they claim six out of every ten dollars spent on digital advertising, and nine out of ten new digital ad dollars. They have become more dominant in what is estimated to be an up to two-trillion-dollar annual global advertising and marketing business. Facebook alone generates more ad dollars than all of America’s newspapers, and Google has twice the ad revenues of Facebook.

In the advertising world, Big Data is the Holy Grail, because it enables marketers to target messages to individuals rather than general groups, creating what’s called addressable advertising. And only the digital giants possess state-of-the-art Big Data. “The game is no longer about sending you a mail order catalogue or even about targeting online advertising,” Shoshana Zuboff, a professor of business administration at the Harvard Business School, [2]wrote on, in 2016. “The game is selling access to the real-time flow of your daily life—your reality—in order to directly influence and modify your behavior for profit.” Success at this “game” flows to those with the “ability to predict the future—specifically the future of behavior,” Zuboff writes. She dubs this “surveillance capitalism.”

However, to thrash just Facebook and Google is to miss the larger truth: everyone in advertising strives to eliminate risk by perfecting targeting data. Protecting privacy is not foremost among the concerns of marketers; protecting and expanding their business is. The business model adopted by ad agencies and their clients parallels Facebook and Google’s. Each aims to massage data to better identify potential customers. Each aims to influence consumer behavior. To appreciate how alike their aims are, sit in an agency or client marketing meeting and you will hear wails about Facebook and Google’s “walled garden,” their unwillingness to share data on their users. When Facebook or Google counter that they must protect “the privacy” of their users, advertisers cry foul: You’re using the data to target ads we paid for—why won’t you share it, so that we can use it in other ad campaigns?

This preoccupation with Big Data is also revealed by the trend in the advertising-agency business to have the media agency, not the creative Mad Men team, occupy the prime seat in pitches to clients, because it’s the media agency that harvests the data to help advertising clients better aim at potential consumers. Agencies compete to proclaim their own Big Data horde. W.P.P.’s GroupM, the largest media agency, has quietly assembled what it calls its “secret sauce,” a collection of forty thousand personally identifiable attributes it plans to retain on two hundred million adult Americans. Unlike Facebook or Google, GroupM can’t track most of what we do online. To parade their sensitivity to privacy, agencies reassuringly boast that they don’t know the names of people in their data bank. But they do have your I.P. address, which yields abundant information, including where you live. For marketers, the advantage of being able to track online behavior, the former senior GroupM executive Brian Lesser said—a bit hyperbolically, one hopes—is that “we know what you want even before you know you want it.”

Worried that Brian Lesser’s dream will become a nightmare, ProPublica has ferociously chewed on the Big Data privacy menace like a dog with a bone: in its series “[3]Breaking the Black Box,” it wrote, “Facebook has a particularly comprehensive set of dossiers on its more than two billion members. Every time a Facebook member likes a post, tags a photo, updates their favorite movies in their profile, posts a comment about a politician, or changes their relationship status, Facebook logs it . . . When they use Instagram or WhatsApp on their phone, which are both owned by Facebook, they contribute more data to Facebook’s dossier.” Facebook offers advertisers more than thirteen hundred categories for ad targeting, according to ProPublica.

Google, for its part, has merged all the data it collects from its search, YouTube, and other services, and has introduced an About Me page, which includes your date of birth, phone number, where you work, mailing address, education, where you’ve travelled, your nickname, photo, and e-mail address. Amazon knows even more about you. Since it is the world’s largest store and sees what you’ve actually purchased, its data are unrivalled. Amazon reaches beyond what interests you (revealed by a Google search) or what your friends are saying (on Facebook) to what you actually purchase. With Amazon’s Alexa, it has an agent in your home that not only knows what you bought but when you wake up, what you watch, read, listen to, ask for, and eat. And Amazon is aggressively building up its meagre ad sales, which gives it an incentive to exploit its data.

Data excite advertisers. Prowling his London office in jeans, Keith Weed, who oversees marketing and communications for Unilever, one of the world’s largest advertisers, described how mobile phones have elevated data as a marketing tool. “When I started in marketing, we were using secondhand data which was three months old,” he said. “Now with the good old mobile, I have individualized data on people. You don’t need to know their names . . . You know their telephone number. You know where they live, because it’s the same location as their PC.” Weed knows what times of the day you usually browse, watch videos, answer e-mail, travel to the office—and what travel routes you take. “From your mobile, I know whether you stay in four-star or two-star hotels, whether you go to train stations or airports. I use these insights along with what you’re browsing on your PC. I know whether you’re interested in horses or holidays in the Caribbean.” By using programmatic computers to buy ads targeting these individuals, he says, Unilever can “create a hundred thousand permutations of the same ad,” as they recently did with a thirty-second TV ad for Axe toiletries aimed at young men in Brazil. The more Keith Weed knows about a consumer, the better he can aim to target a sale.

Engineers and data scientists vacuum data. They see data as virtuous, yielding clues to the mysteries of human behavior, suggesting efficiencies (including eliminating costly middlemen, like agency Mad Men), offering answers that they believe will better serve consumers, because the marketing message is individualized. The more cool things offered, the more clicks, the more page views, the more user engagement. Data yield facts and advance a quest to be more scientific—free of guesses. As Eric Schmidt, then the executive chairman of Google’s parent company, Alphabet, said at the company’s 2017 shareholder meeting, “We start from the principles of science at Google and Alphabet.”

They believe there is nobility in their quest. By offering individualized marketing messages, they are trading something of value in exchange for a consumer’s attention. They also start from the principle, as the TV networks did, that advertising allows their product to be “free.” But, of course, as their audience swells, so does their data. Sandy Parakilas, who was Facebook’s operations manager on its platform team from 2011 to 2012, put it this way in [4]a scathing Op-Ed for the Times, last November: “The more data it has on offer, the more value it creates for advertisers. That means it has no incentive to police the collection or use of that data—except when negative press or regulators are involved.” For the engineers, the privacy issue—like “fake news” and even fraud—was relegated to the nosebleed bleachers.

With a chorus of marketers and citizens and governments now roaring their concern, the limitations of Math Men loom large. Suddenly, governments in the U.S. are almost as alive to privacy dangers as those in Western Europe, confronting Facebook by asking how the political-data company Cambridge Analytica, employed by Donald Trump’s Presidential campaign, was able to [5]snatch personal data from eighty-seven million individual Facebook profiles. Was Facebook blind—or deliberately mute? Why, they are really asking, should we believe in the infallibility of your machines and your willingness to protect our privacy?

Ad agencies and advertisers have long been uneasy not just with the “walled gardens” of Facebook and Google but with their unwillingness to allow an independent company to monitor their results, as Nielsen does for TV and comScore does online. This mistrust escalated in 2016, when it emerged that Facebook and Google charged advertisers for ads that tricked other machines to believe an ad message was seen by humans when it was not. Advertiser confidence in Facebook was further jolted later in 2016, when it was revealed that the Math Men at Facebook overestimated the average time viewers spent watching video by up to eighty per cent. And in 2017, Math Men took another beating when news broke that Google’s YouTube and Facebook’s machines were inserting friendly ads on unfriendly platforms, including racist sites and porn sites. These were ads targeted by keywords, like “Confederacy” or “race”; placing an ad on a history site might locate it on a Nazi-history site.

The credibility of these digital giants was further subverted when Russian trolls proved how easy it was to disseminate “fake news” on social networks. When told that Facebook’s mechanized defenses had failed to screen out disinformation planted on the social network to sabotage Hillary Clinton’s Presidential campaign, Mark Zuckerberg publicly dismissed the assertion as “pretty crazy,” a position he later conceded was wrong.

By the spring of 2018, Facebook had lost control of its narrative. Their original declared mission—to “connect people” and “build a global community”—had been replaced by an implicit new narrative: we connect advertisers to people. It took Facebook and Google about five years before they figured out how to generate revenue, and today roughly ninety-five per cent of Facebook’s dollars and almost ninety per cent of Google’s comes from advertising. They enjoy abundant riches because they tantalize advertisers with the promise that they can corral potential customers. This is how Facebook lured developers and app makers by offering them a permissive Graph A.P.I., granting them access to the daily habits and the interchange with friends of its users. This Graph A.P.I. is how Cambridge Analytica got its paws on the data of eighty-seven million Americans.

The humiliating furor this news provoked has not subverted the faith among Math Men that their “science” will prevail. They believe advertising will be further transformed by new scientific advances like artificial intelligence that will allow machines to customize ads, marginalizing human creativity. With algorithms creating profiles of individuals, Airbnb’s then chief marketing officer, Jonathan Mildenhall, told me, “brands can engineer without the need for human creativity.” Machines will craft ads, just as machines will drive cars. But the ad community is increasingly mistrustful of the machines, and of Facebook and Google. During a presentation at Advertising Week in New York this past September, Keith Weed offered a report to Facebook and Google. He gave them a mere “C” for policing ad fraud, and a harsher “F” for cross-platform transparency, insisting, “We’ve got to see over the walled gardens.”

That mistrust has gone viral. A powerful case for more government regulation of the digital giants was made by The Economist, a classically conservative publication that also endorsed the government’s antitrust prosecution of Microsoft, in 1999. The magazine [6]editorialized, in May, 2017, that governments must better police the five digital giants—Facebook, Google, Amazon, Apple, and Microsoft—because data were “the oil of the digital era”: “Old ways of thinking about competition, devised in the era of oil, look outdated in what has come to be called the ‘data economy.’ ” Inevitably, an abundance of data alters the nature of competition, allowing companies to benefit from network effects, with users multiplying and companies amassing wealth to swallow potential competitors.

The politics of Silicon Valley is left of center, but its disdain for government regulation has been right of center. This is changing. A Who’s Who of Silicon notables—Tim Berners-Lee, Tim Cook, Ev Williams, Sean Parker, and Tony Fadell, among others—have harshly criticized the social harm imposed by the digital giants. Marc Benioff, the C.E.O. of—echoing similar sentiments expressed by Berners-Lee—[7]has said, “The government is going to have to be involved. You do it exactly the same way you regulated the cigarette industry.”

Cries for regulating the digital giants are almost as loud today as they were to break up Microsoft in the late nineties. Congress insisted that Facebook’s Zuckerberg, not his minions, testify. The Federal Trade Commission is investigating Facebook’s manipulation of user data. Thirty-seven state attorneys general have [8]joined a demand to learn how Facebook safeguards privacy. The European Union has imposed huge fines on Google and wants to inspect Google’s crown jewels—its search algorithms—claiming that Google’s search results are skewed to favor their own sites. The E.U.’s twenty-eight countries this May imposed a General Data Protection Regulation to protect the privacy of users, requiring that citizens must choose to opt in before companies can horde their data.

Here’s where advertisers and the digital giants lock arms: they speak with one voice in opposing opt-in legislation, which would deny access to data without the permission of users. If consumers wish to deny advertisers access to their cookies—their data—they agree: the consumer must voluntarily opt out, meaning they must endure a cumbersome and confusing series of online steps. Amid the furor about Facebook and Google, remember these twinned and rarely acknowledged truisms: more data probably equals less privacy, while more privacy equals less advertising revenue. Thus, those who rely on advertising have business reasons to remain tone-deaf to privacy concerns.

Those reliant on advertising know: the disruption that earlier slammed the music, newspaper, magazine, taxi, and retail industries now upends advertising. Agencies are being challenged by a host of competitive frenemies: by consulting and public-relations companies that have jumped into their business; by platform customers like Google and Facebook but also the Times, NBC, and Buzzfeed, that now double as ad agencies and talk directly to their clients; by clients that increasingly perform advertising functions in-house.

But the foremost frenemy is the public, which poses an existential threat not just to agencies but to Facebook and the ad revenues on which most media rely. Citizens protest annoying, interruptive advertising, particularly on their mobile phones—a device as personal as a purse or wallet. An estimated twenty per cent of Americans, and one-third of Western Europeans, employ ad-blocker software. More than half of those who record programs on their DVRs choose to skip the ads. Netflix and Amazon, among others, have accustomed viewers to watch what they want when they want, without commercial interruption.

Understandably, those dependent on ad dollars quake. The advertising and marketing world scrambles for new ways to reach consumers. Big Data, they believe, promises ways they might better communicate with annoyed consumers—maybe unlock ways that ads can be embraced as a useful individual service rather than as an interruption. If Big Data’s use is circumscribed to protect privacy, the advertising business will suffer. In this core conviction, at least, Mad Men and Math Men are alike.

This piece is partially adapted from Auletta’s forthcoming book, “[9]Frenemies: The Epic Disruption of the Ad Business (and Everything Else).”


Visible links

HackerNewsBot debug: Calculated post rank: 101 - Loop: 448 - Rank min: 100 - Author rank: 49


Bell numbers

The nth Bell number is the number of ways to partition a set of
size n. It’s also equal to the following sum.

Bild/Foto{.aligncenter .size-medium
width="110" height="49"}

You may have to look at that sum twice to see it correctly. It looks a
lot like the sum for e^n^ except the rolls of k and n are
reversed in the numerator.

The sum reminded me of the equation


that I
about a few weeks ago. It’s correct, but you may have to stare at it a
little while to see why.

Incidentally, the nth Bell number is the nth moment of a Poisson
random variable
with mean 1.

There’s also a connection with Bell
The nth Bell number is the sum of the coefficients of the nth
complete Bell polynomial. The nth Bell polynomial is sum over k
of the partial exponential Bell polynomials B~n,k~. The partial
(exponential) Bell polynomials are defined

#johndcook #Math
Bell numbers

John D. Cook: Bell numbers


Tetrahedral numbers

Start with the sequence of positive integers:

1, 2, 3, 4, …

Not take partial sums, the nth term of the new series being the sum of
the first n terms of the previous series. This gives us the triangular
numbers, so called because they count the number of coins at each level
of a triangular arrangement:

1, 3, 6, 10, …

If we repeat this again we get the tetrahedral numbers, the number of
balls on each level of a tetrahedral stack of balls.

1, 4, 10, 20, …

We can repeat this process and general define T~n,\ d~, the nth
tetrahedral number of dimension d, recursively. We
define T~n,\ 1~ = n and for d > 1,

.size-medium width="187" height="49"}

This is just a formalization of the discussion above.

It turns out there’s a simple expression for tetrahedral number of all

.size-medium width="176" height="43"}

Here’s a quick proof by induction. The theorem clearly holds when n =
1 or d = 1. Now assume it hold for all n < m and d < m.

.size-medium width="334"}

The last line follows from the binomial addition identity

.size-medium width="214" height="43"}

It also turns out that T~n,\ d~ is the number of ways to
select d things from a set of n with

Related posts:Bild/Foto{width="1"
#johndcook #Math
Tetrahedral numbers

John D. Cook: Tetrahedral number formula


Stirling numbers, including negative arguments

Stirling numbers are something like binomial coefficients. They come in
two varieties, imaginatively called the first kind and second kind.
Unfortunately it is the second kind that are simpler to describe and
that come up more often in applications, so we’ll start there.

Stirling numbers of the second kind

The Stirling number of the second kind

.size-medium width="121" height="33"}

counts the number of ways to partition a set of n elements into k
non-empty subsets. The notation on the left is easier to use inline, and
the subscript reminds us that we’re talking about Stirling numbers of
the second kind. The notation on the right suggests that we’re dealing
with something analogous to binomial coefficients, and the curly braces
suggest this thing might have something to do with counting sets.

Since the nth Bell
the total number of ways to partition a set of n elements into any
number of non-empty subsets, we have

.size-medium width="113" height="49"}

Stirling numbers of the first kind

The Stirling numbers of the first kind

.size-medium width="115" height="33"}

count how many ways to partition a set into cycles rather than
subsets. A cycle is a sort of ordered subset. The order of elements
matters, but only a circular way. A cycle of size k is the number of
ways to place k items evenly around a circle, where two cycles are
considered the same if you can rotate one into the other. So, for
example, [1, 2, 3] and [2, 3, 1] represent the same cycle, but [1,
3, 3] and [1, 3, 2] represent different cycles.

Since a set with at least three elements can be arranged into multiple
cycles, Stirling numbers of the first kind are greater than or equal to
Stirling numbers of the second kind, given the same arguments.

Addition identities

We started out by saying Stirling numbers were like binomial
coefficients, and here we show that they satisfy addition identities
similar to binomial coefficients.

For binomial coefficients we have

.size-medium width="223" height="43"}

To see this, imagine we start with the set of the numbers 1 through n.
How many ways can we select a subset of k items? We have selections
that exclude the number 1 and selections that include the number 1.
These correspond to the two terms on the right side of the identity.

The analogous identities for Stirling numbers are

.size-medium width="240" height="43"}

.size-medium width="262" height="43"}\
The combinatorial proofs of these identities are similar to the argument
above for binomial coefficients. If you want to partition the numbers 1
through n into k subsets (or cycles), either 1 is in a subset
(cycle) by itself or not.

More general arguments

Everything above has implicitly assumed n and k were positive, or at
least non-negative, numbers. Let’s look first at how we might handle
zero arguments, then negative arguments.

It works out best if we define S~1~(0, k) and S~2~(0, k) to be 1
if k is 0 and zero otherwise. Also we define S~1~(n, 0) and
S~2~(n, 0) to be 1 if n is 0 and zero otherwise. (See Concrete

When either n or k is zero the combinatoric interpretation is
strained. If we let either be negative, there is no combinatorial
interpretation, though it can still be useful to consider negative
arguments, much like one does with binomial

We can take the addition theorems above, which are theorems for
non-negative arguments, and treat them as definitions for negative
arguments. When we do, we get the following beautiful dual relationship
between Stirling numbers of the first and second kinds:

.size-medium width="109" height="43"}

Related posts

#johndcook #Math
Stirling numbers, including negative arguments

John D. Cook: Stirling numbers of the first and second kind

Je sais le slogan que je vais balancer la prochaine fois que j'irrais a une manifestation :

"continu partout ! dérivable non part !"

#humour #math #slogan


Mathematics of Deep Note

Bild/Foto{.size-medium width="500"

I just finished listening to the latest
episode of Twenty Thousand
Hertz, the story behind “Deep Note,” the THX logo sound.

There are a couple mathematical details of the sound that I’d like to
explore here: random number generation, and especially Pythagorean

Random number generation

First is that part of the construction of the sound depended on a random
number generator. The voices start in a random configuration and slowly
reach the target D major chord at the end.

Apparently the random number generator was not seeded in a reproducible
way. This was only mentioned toward the end of the show, and a teaser
implies that they’ll go more into this in the next episode.

Pythagorean tuning

The other thing to mention is that the final chord is based on
Pythagorean tuning, not the more familiar equal temperament.

The lowest note in the final chord is D1. (Here’s an explanation of
musical pitch
The other notes are D2, A2, D3, A3, D4, A4, D5, A5, D6, and F\#6.


Octave frequencies are a ratio of 2:1, so if D1 is tuned to 36 Hz, then
D2 is 72 Hz, D3 is 144 Hz, D4 is 288 Hz, D5 is 576 Hz, and D6 is 1152


In Pythagorean tuning, fifths are in a ratio of 3:2. In equal
temperament, a fifth is a ratio of 2^7/12^ or 1.4983 [1], a little
less than 3/2. So Pythagorean fifths are slightly bigger than equal
temperament fifths. (I explain all this

If D2 is 72 Hz, then A2 is 108 Hz. It follows that A3 would be 216 Hz,
A4 would be 432 Hz (flatter than the famous A 440), and A5 would be 864

Major thirds

The F\#6 on top is the most interesting note. Pythagorean tuning is
based on fifths being a ratio of 3:2, so how do you get the major third
interval for the highest note? By going up by fifths 4 times from D4,
i.e. D4 -> A4 -> E5 -> B5 -> F\#6.

The frequency of F\#6 would be 81/16 of the frequency of D4, or 1458 Hz.

The F\#6 on top has a frequency 81/64 that of the D# below it. A
Pythagorean major third is a ratio of 81/64 = 1.2656, whereas an equal
temperament major third is f 2^4/12^ or 1.2599 [2]. Pythagorean tuning
makes more of a difference to thirds than it does to fifths.

A Pythagorean major third is sharper than a major third in equal
temperament. Some describe Pythagorean major chords as brighter or
sweeter than equal temperament chords. That the effect the composer was
going for and why he chose Pythagorean tuning.


Then after specifying the exact pitches for each note, the composer
actually changed the pitches of the highest voices a little to make the
chord sound fuller. This makes the three voices on each of the highest
notes sound like three voices, not just one voice. Also, the chord
shimmers a little bit because the random effects from the beginning of
Deep Note never completely stop, they are just diminished over time.

Related posts

[1]The exponent is 7/12 because a half step is 1/12 of an octave, and
a fifth is 7 half steps.

[2]The exponent is 4/12 because a major third is 4 half steps.

Musical score above via THX Ltd on

#johndcook #Math #Music #Acoustics
Mathematics of Deep Note

John D. Cook: Mathematics of the THX Deep Note logo sound


Partition numbers and Ramanujan’s approximation

The partition function p(n) counts the number of ways n unlabeled
things can be partitioned into non-empty sets. (Contrast with Bell
count partitions of labeled things.)

There’s no simple expression for p(n), but Ramanujan discovered a
fairly simple asymptotic approximation:

.size-medium width="213" height="40"}

How accurate is this approximation? Here’s a little Matheamtica code to
p [n_]:= PartitionsP
[n]approx [n_]:= Exp[ Pi Sqrt[2 n/3]] / (4 n Sqrt[3])
relativeError [n_]:= Abs[p [n]- approx[n]]/p
[n]ListLinePlot[Table[relativeError[n], {n, 100}]]

.size-medium width="479" height="295"}

So for values of n around 100, the approximation is off by about 5%.

Since it’s an asymptotic approximation, the relative error decreases
(albeit slowly, apparently) as n increases. The relative error for n
= 1,000 is about 1.4% and the relative error for n = 1,000,000 is
about 0.044%.

#johndcook #Math #Mathematica #Numbertheory
Partition numbers and Ramanujan’s approximation

John D. Cook: Partition numbers and Ramanujan's approximation


Almost prime generators and almost integers

Here are two apparently unrelated things you may have seen before. The
first is an observation going back to Euler that the polynomial

.size-medium width="89" height="17"}

produces a long sequence of primes. Namely, the values are prime for n
= 1, 2, 3, …, 40.

The second is that the number

.size-medium width="46" height="18"}

is extraordinarily close to an integer. This number is known as
Ramanujan’s constant. It differs from its nearest integer by 3 parts in
10^30^. Ramanujan’s constant equals


There is a connection between these two facts: The polynomial

.size-medium width="81" height="17"}

returns primes for n = 1, 2, 3, …, k-1 primes if 4k – 1 is a
Heegner number, and

.size-medium width="33" height="18"}

is almost an integer if d is a (large) Heegner number.

Source: The Book of Numbers by Conway and

Heegner numbers

So what’s a Heegner number and how many are there? An integer d is a
Heegner number if the ring generated by appending √-d to the integers
has unique factorization. There are nine such numbers:

1, 2, 3, 7, 11, 19, 43, 67, 163.

There’s deeper stuff going on here than I understand—modular forms, the
j-function, etc.—so this post won’t explain everything. There’s
something unsatisfying about saying something is “almost” an integer
without quantifying. There’s a way to be more precise, but we won’t go
there. Instead, we’ll just play with the results.

Mathematica computation

First we look at the claim that n² – n + k produces primes for n
= 1 through k – 1 if 4k – 1 is a Heegner number. The Heegner numbers
of the form 4k + 1 are 2, 3, 5, 11, and 17. The following code shows
that the claim is true for these values of k.
k = {2, 3, 5, 11, 17}
claim [x_]:= AllTrue[
Table[n^2 - n + x, {n, x - 1}],
AllTrue[k, claim]

This returns True, so the claim is true.

As for exp(π √d) being close to an integer, this apparently only true
for the last three Heegner numbers.
For[i = 1, i < 10, i++,
Exp[ Pi Sqrt[ heegner[[i]] ] ],

(The function AccountingForm suppresses scientific notation, making it
easier to see where the decimal portion of the number starts.)

Here are the results:

I manually edited the output to align the decimal points and truncate
the decimal places beyond that needed to show that the last number is
not an integer.

#johndcook #Math #Mathematica #Numbertheory
Almost prime generators and almost integers

John D. Cook: Almost prime generators and almost integers


Low-rank matrix perturbations

Here are a couple of linear algebra identities that can be very useful,
but aren’t that widely known, somewhere between common knowledge and
arcane. Neither result assumes any matrix has low rank, but their most
common application, at least in my experience, is in the context of
something of low rank added to something of full rank.

Sylvester’s determinant theorem

The first is Sylvester’s determinant theorem. If A is a n
by k matrix and B is a k by n matrix, then

Bild/Foto{.aligncenter .size-medium
width="230" height="18"}

where I is an identity matrix whose dimension is given by its
subscript. The theorem is true for any k, but it’s especially useful
when k is small because the matrices on the right side are of
size k. If k = 1, the right side is the determinant of a 1 by 1
matrix, i.e. just a number!

Woodbury matrix inversion lemma

The second is known as the matrix inversion lemma or Woodbury’s matrix
identity. It says

.size-medium width="422" height="24"}

which is a lot to take in at once. We’ll unpack it a little at a time.

First of all, the matrices have whatever properties are necessary for
the equation to make sense: A and C are square, invertible matrices,
though possibly of different sizes.

Suppose A is n by n. Then U necessarily has n rows. Let k be
the number of columns in U. Then C is k by k and V is k
by n.

To simplify things a little, let C be the identity.

.size-medium width="386" height="24"}

Think of k as small, maybe even 1. Then UV is a low-rank
perturbation of A. Suppose you have computed the inverse of A, or
know something about the inverse of A, and want to know how that
inverse changes when you change A by adding a rank k matrix to it.

To simplify things further, assume A is the identity matrix. Now the
matrix inversion lemma reduces to something similar to Sylvester’s
theorem above.

Bild/Foto{.aligncenter .size-medium
width="268" height="21"}

To make things clearer, we’ll add subscripts on the identity matrices as

Bild/Foto{.aligncenter .size-medium
width="289" height="21"}

If k is small, then the matrix between U and V on the right side
is small. If k = 1, it’s just a number, and its inverse is just it’s

Related posts

#johndcook #Math #Linearalgebra
Low-rank matrix perturbations

John D. Cook: Low rank matrix peturbations | Sylvester, Woodbury


Magical learning

I asked two questions on twitter yesterday. The previous
summarized the
results for a question about books that I asked from my personal Twitter

This post will summarize the results of a question I asked from
If a genie offered to give you a thorough understanding of one
theorem, what theorem would you choose?

— Analysis Fact (@AnalysisFact) June 15,
Here are the results by category.

Computer science

  • Curry-Howard correspondence
  • P=NP or not
  • Collatz Conjecture


  • Cohen’s forcing theorem
  • Godel’s Incompleteness theorem
  • Continuum hypothesis
  • Zorn’s lemma

Algebra and number theory

  • The ABC conjecture (theorem?)
  • Prime number theorem.
  • Riemann hypothesis
  • Fundamental theorem of finite abelian groups
  • Classification of finite simple groups
  • Fermat’s last theorem, the unpublished Fermat version

Topology and differential geometry

  • Thurston’s geometrization conjecture
  • Gauss Bonnet theorem
  • de Rham’s theorem
  • Grothendieck Riemann Roch theorem


  • Banach-Tarski theorem.
  • Stokes theorem
  • Carleson-Hunt theorem
  • The epsilon/delta definition of continuity
  • Universal approximation theorem

Differential equations

  • Navier–Stokes existence and smoothness postulate
  • The relativistic version of the Shrodinger equation
  • Atiyah-Singer index theorem

Mathematical physics

  • E = mc²
  • Noether’s theorem
  • Liouville’s theorem


  • Existence of general equilibrium prices
  • Farkas’ lemma
  • The graph minor theorem
  • Central limit theorem
#johndcook #Math

#Interesting #website that lets you generate #random #math #formula that is then plotted as #art. The picture is generated just from a textual string. Each piece is therefore defined completely by its name. For example, my username, drummyfish, generates the following #picture:


Try it out :) Post interesting results.

#Tip: if you want to draw the picture in bigger resolution, right click on it -> inspect -> change the canvas size to whatever you want. But beware, it takes a long time to draw.

#procedural #graphics


The Math of Card Shuffling

Riffling from factory order to complete randomness.
Article word count: 748

HN Discussion:
Posted by danso (karma: 94295)
Post stats: Points: 132 - Comments: 26 - 2018-06-17T16:29:24Z

\#HackerNews #card #math #shuffling #the
Article content:

You’ve probably seen a few ways to shuffle a deck of cards. Sometimes the deck is split in half and the halves are switched. Sometimes the deck is [1]smooshed until it’s all mixed up. But most of the time, a deck of cards is shuffled using a riffle.

Here’s a question: how many times do you have to riffle a deck of cards before it is completely shuffled? It’s a tricky one, but math has us covered: [2]you need seven riffles.

Riffle seven times and you’ll have a sufficiently random ordering of cards, an ordering that has likely [3]never existed before. In other words, it’s unlikely you’ll ever shuffle two decks the same.

The card shuffling result appears in a [4]Numberphile video from 2015, along with a number of other card shuffling facts. Here’s another problem posed in that video: what if instead of a standard riffle using a deck roughly split in half, you were to only riffle 1 card at a time?

That is, using a standard deck of 52 cards, in one hand hold 51 cards and in the other hold the remaining single card. Now riffle these together. This is equivalent to taking one card and placing it at random inside of the deck.

So here’s the question:
How many single riffles do you have to do in order to have a completely shuffled deck?


You could simulate this in a short program, which we will do towards the end, but first we can solve for the number of riffles explicitly.

Consider an ordered deck of cards. [5]Without loss of generality, let’s say the suits are in the following order: ♠, ♣, ♥, ♦. So our ordered deck looks like this.

The bottom suit is ♦, which means the bottom card of our deck is the King of Diamonds (K♦). Now perform the following iteration:
Place the top card of the deck randomly inside the deck
This means taking the A♠ and placing it randomly somewhere in the deck. The top card then becomes 2♠.

If this procedure is repeated, eventually the top card will be placed at the very bottom of the deck, shifting the K♦ to the penultimate position. Since every riffled card has a 152\frac{1}{52}521​ chance of moving to any new position in the deck, that means, on average, after about 52 top card riffles, the top card will become the new bottom card.

Once the K♦ moves up one position, upon subsequent riffles there are now two spots for the new top card to be placed underneath it. That means there is now a 152+152=252\frac{1}{52}+\frac{1}{52}=\frac{2}{52}521​+521​=522​ chance of a riffled card going underneath the K♦.

Continuing this procedure, the original bottom card, the K♦, will eventually rise to the top of the deck and be riffled. Once this happens, the deck is randomly shuffled: the order we’re left with is equally as likely as any other order.

So, how many single card riffles does this take? Recall each time a card is placed underneath the K♦, our chances of placing another card increases by 152\frac{1}{52}521​ . We can calculate the number of riffles this would take.

∑i=15252i=521+522+523+...+5252≈235\sum_{i=1}^{52} \frac{52}{i} = \frac{52}{1} + \frac{52}{2} + \frac{52}{3} + ... + \frac{52}{52} \approx 235i=1∑52​i52​=152​+252​+352​+...+5252​≈235

On average, 235 single card riffles will randomly shuffle a deck of cards.

Let’s Riffle

Equations are great, but let’s visualize this! Below is the same ordered deck of cards from before, except the K♦ has been highlighted red so we can follow its journey to the top of the deck.

Click the Riffle button to move the top card somewhere else in the deck randomly.

Did you see where it went? Click again.

Click a bunch more really fast.

Now I could tell you to keep clicking until the highlighted K♦ rises to the top, but as we have already shown, that would take about 235 clicks. Instead, click the Riffle (x10) button to riffle 10 times. Keep riffling until the K♦ moves to the top.

Here is a chart of the K♦’s position in the deck for each riffle. Notice how it takes many riffles to move the K♦ up just a few positions, but once the K♦ starts rising towards the top of the deck, it accelerates.

On average, the K♦ will reach the top position somewhere around 235 riffles. Since this is the average result, there is a chance your first shuffled deck of cards took less riffles (or many more!). To try again, click the Clear button and get riffling.

\* This article was created using [6]Idyll.
\* Shoutout to [7]@mathisonian for help and feedback.
\* The source code is available on [8]Github.


Visible links

HackerNewsBot debug: Calculated post rank: 96 - Loop: 321 - Rank min: 80 - Author rank: 86