[HN Gopher] Prophet: Automatic Forecasting Procedure
___________________________________________________________________
 
Prophet: Automatic Forecasting Procedure
 
Author : klaussilveira
Score  : 161 points
Date   : 2023-09-26 18:35 UTC (4 hours ago)
 
web link (github.com)
 
| loehnsberg wrote:
| From my own experience, a properly cross-validated lasso
| regression over a wide range of autoregressive features beats FB
| Prophet by a good margin and offers nearly the same degree of
| automation.
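
A minimal sketch of the feature side of this approach, assuming
the series is a plain list of floats. The helper names
(make_lag_features, expanding_window_splits) are hypothetical, and
the lasso fit itself is left out; in practice it would come from a
library such as scikit-learn. The sketch only illustrates that
cross-validating a time series properly means every training fold
ends before its test fold begins.

```python
def make_lag_features(series, n_lags):
    """Return (X, y): each row of X holds the n_lags values preceding y."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])
        y.append(series[t])
    return X, y


def expanding_window_splits(n_rows, n_folds, min_train):
    """Yield (train_idx, test_idx) pairs; training always precedes testing."""
    fold_size = (n_rows - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))


# Toy series, purely for illustration.
series = [float(v) for v in range(20)]
X, y = make_lag_features(series, n_lags=3)
splits = list(expanding_window_splits(len(X), n_folds=3, min_train=8))
```

A regularization strength would then be picked by fitting on each
train_idx and scoring on the matching test_idx, never the reverse.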
 
| krawczstef wrote:
| This library is old news? Is there anything new that they've
| added that's noteworthy to take it for another spin?
| 
| [disclaimer I'm a maintainer of Hamilton] Otherwise FYI Prophet
| gels well with https://github.com/DAGWorks-Inc/hamilton for
| setting up your features and dataset for fitting &
| prediction[/disclaimer].
 
| atwoodjw wrote:
| Prophet is a PITA to install with PyPy on Apple Silicon. Beware.
 
| davmre wrote:
| Prophet has gotten a lot of attention since being released in
| 2017, I think because the idea of a fully automatic solution is
| very appealing to people. One of the original developers, Sean
| Taylor, recently posted a nice retrospective on the project's
| successes and failures:
| https://medium.com/@seanjtaylor/a-personal-retrospective-on-...
| He quotes one of his earlier tweets:
| 
| > If I could build it again, I'd start with automating the
| > evaluation of forecasts. It's silly to build models if you're
| > not willing to commit to an evaluation procedure. I'd also
| > probably remove most of the automation of the modeling. People
| > should explicitly make these choices.
| 
| Having worked on similar Bayesian time-series forecasting tools
| at Google, this matches my experience (though I've never used
| Prophet seriously, so please don't take this as any direct
| judgement of it as a software package). There is a lot of value
| in a framework that lets you easily experiment with different
| model structures (our version of this was the structural time
| series tools in TensorFlow Probability, see, e.g.,
| https://blog.tensorflow.org/2019/03/structural-time-series-m...).
| But if you're forecasting something you actually care about, it's
| usually worth the time to try to understand yourself what
| structure makes sense for your problem, and do a careful
| evaluation on held-out data with respect to whatever metric
| you're really trying to optimize. A fully automated search over
| model structures is cute, but even when it works, it mostly just
| ends up rediscovering properties of the data you could or should
| have already known (e.g., of course traffic to your work-related
| website will have a day-of-week effect), so the cases where it
| really adds practical value are harder to find than you might
| like.
| 
| Even in the age of deep learning, I do think these relatively
| classical Bayesian models have a lot of value for many
| applications. Time-series forecasting tends to be a case where:
| 
| - you don't have a ton of iid data points (often, only a single
| time series),
| 
| - you'd like forecasts with principled uncertainty estimates,
| e.g., credible intervals, giving you a range of scenarios to plan
| for,
| 
| - you often do have a pretty good idea of what features are
| relevant to the process you're predicting, and
| 
| - you want to understand in detail what features the forecast is
| accounting for (and what it might be missing),
| 
| all of which play to the strengths of more classical, structured
| statistical models, compared to more data-hungry black-box deep
| learning models. So the basic ideas in Prophet and similar tools
| do still have a lot of relevance going forward, IMHO.
 
  | esafak wrote:
  | You mention classical models but Bayesian deep learning is a
  | thing too. One can even retrofit existing DL models to obtain
  | uncertainty estimates, at the expense of increasing (possibly
  | doubling) the number of model parameters.
  | 
  | The quality of the uncertainty estimates is a question though.
 
| whimsicalism wrote:
| Time series forecasting is not at all solved. Prophet does not
| solve it for you.
 
| NeoTar wrote:
| I am intrigued by how this would perform on astronomical data.
| 
| If anyone is not aware there are many periodic phenomena in
| astronomy - e.g. variable stars which can have periods from
| minutes to hundreds of days.
| 
| The description of this library sounds like it's very tied to the
| human world - talking about yearly, weekly and daily seasonality.
| 
| [Weirdly though, we do sometimes see variability on 'human'
| timescales in astronomical data series. If maintenance is carried
| out weekly on a Monday that can add a signal into the data
| through missing datapoints.]
 
| littlestymaar wrote:
| I'd be curious to see how it performs on economics data compared
| to mainstream models (say DSGE) whose results have never
| impressed me with their predictive power.
 
  | 54r4rf wrote:
  | Nonsense vs nonsense. Close call
 
| ipaddr wrote:
| Facebook developers are doing some really great stuff. For some
| reason it doesn't translate into a really great facebook or
| instagram. The experience is worse compared to 10 years ago. If
| they hired 10,001 of the best developers not working at facebook
| I think their products would be the same or worse. Is there a
| single person responsible for the vision?
 
| chubs wrote:
| On this topic, does anyone know of a suitable time-series
| forecaster for multivariate analysis? Eg 8 independent/input
| variables, and one output variable? I've been using multiple
| linear regression (which works impressively!) but it doesn't take
| into account the time series, only the single prior day of
| inputs. Thanks :)
 
| willmeyers wrote:
| Here's some other similar Python packages for forecasting:
| 
| - https://nixtla.github.io/neuralforecast/
| 
| - https://github.com/ourownstory/neural_prophet
 
| pantsforbirds wrote:
| Has anyone else struggled with Prophet? I've experimented with it
| on a few real world datasets and I've had very inconsistent
| results.
 
  | HWR_14 wrote:
  | Based just on the documentation, it seems there are some
  | assumptions they expect the data to adhere to, and if they
  | don't apply then it would not produce good results.
 
  | jgalt212 wrote:
  | maybe because time series forecasting, for any time series of
  | interest, is pretty much not possible.
 
  | ttt333 wrote:
  | Yes. I've tried using it for pretty straightforward time series
  | forecasts, and I struggled to make it into something useful in
  | a business context.
  | 
  | I'll disclaim that I'm just a finance dude and not a data
  | scientist or programmer. But the documentation leads me to
  | believe that I am in the target audience. I felt like I could
  | grasp the basic mechanics after reading the paper, but I wish
  | the documentation could help someone like me be more
  | intelligent with the 'tuning' of the model. I could never get
  | accuracy below 15% average error, which is too large for my use
  | case.
  | 
  | Probably user ignorance, but that's my experience.
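
The comment does not say which metric "15% average error" refers
to. One common reading is mean absolute percentage error (MAPE),
sketched below under that assumption. The function name and the
numbers are illustrative only.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent; skips zero actuals."""
    terms = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    if not terms:
        raise ValueError("no non-zero actual values to score against")
    return 100.0 * sum(terms) / len(terms)


# Made-up actuals and forecasts.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
score = mape(actual, predicted)
```

MAPE is undefined when an actual value is zero and grows very
large when actuals are near zero, so a figure like 15% is easier
to judge next to the same metric for a naive baseline.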
 
    | SpaceManNabs wrote:
    | You are the primary audience. Time series forecasting with
    | deep learning is fraught with inconsistency. Someone on r/ML
    | went pretty hard on detailing a survey and the stuff that was
    | SOTA 10 years ago still is. Wish I saved that thread. The
    | dude was well published.
    | 
    | edit: found it
    | https://www.reddit.com/r/MachineLearning/comments/pe1lst/r_i...
    | 
    | Turns out it was about time series anomaly detection, but if
    | you can detect, you can forecast if your model is generative
 
      | isaacfung wrote:
      | These may be related.
      | 
      | https://www.reddit.com/r/datascience/comments/11vzjhi/statis...
      | 
      | https://www.reddit.com/user/eamonnkeogh/submitted/?sort=top
 
        | SpaceManNabs wrote:
        | I updated my comment with the thread but it was actually
        | about time series anomaly detection. Turns out it was the
        | same dude in your second link, and your comment includes
        | forecasting in the first link as well. Thank you!
 
      | Pandabob wrote:
      | When was this? I might go chasing this lead down, but even
      | a fuzzy estimation of when would help. Will come link it
      | here if I find it.
 
        | SpaceManNabs wrote:
        | I updated my comment!
 
    | eep_social wrote:
    | This looks to me like something they'd be using for internal
    | capacity planning. If so, they'd be asking it questions like,
    | "how much capacity do we build out for the upcoming holiday
    | rush?" I wouldn't be surprised if financial datasets are very
    | noisy compared to service capacity metrics. I didn't read the
    | paper though, maybe this is addressed and maybe I'm wrong
    | about the use case! But stuff like the below from the docs
    | reads like a capacity planning tool to me:
    | 
    | > As an example, let's look at a time series of the log daily
    | page views for the Wikipedia page for Peyton Manning. We
    | scraped this data using the Wikipediatrend package in R.
    | Peyton Manning provides a nice example because it illustrates
    | some of Prophet's features, like multiple seasonality,
    | changing growth rates, and the ability to model special days
    | (such as Manning's playoff and superbowl appearances).
 
      | philjohn wrote:
      | Also perhaps anomaly detection in a metric.
 
  | sbohacek wrote:
  | I have not been able to get good results either, but I have not
  | tried it in the past year. I also tried many of the
  | architectures in Darts. I have found that fairly
  | straightforward architectures work well. That is, I can iterate
  | on my own design for my own specific data (with all its
  | specific covariates) and get better results than I could with
  | Darts or Prophet.
 
| nighthawk454 wrote:
| Model development on Prophet stopped this year:
| https://medium.com/@cuongduong_35162/facebook-prophet-in-202...
| 
| They recommend checking out these for cutting-edge time series
| forecasting:
| 
| https://neuralprophet.com/
| 
| https://nixtla.github.io/statsforecast/
 
  | mochomocha wrote:
  | Fun fact: if you don't care about the auto-regressive aspect of
  | NeuralProphet (it's turned off by default), you can implement
  | the core of NeuralProphet/Prophet (piecewise linear trend +
  | Fourier on weekly/daily seasonality) in about 60 LOCs with no
  | other dependency than either torch or numpy+scipy.optimize, and
  | without having to deal with Stan or the very poorly chosen
  | heuristics of neuralprophet.
  | 
  | Another thing that both NeuralProphet and Prophet do extremely
  | wrong by default is uncertainty estimation. The coverage
  | probabilities are way off.
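
A minimal sketch of how the coverage claim can be checked on
held-out data, assuming you already have lower and upper interval
bounds from whatever model you are testing. The function name and
the numbers are illustrative, not output from Prophet or
NeuralProphet.

```python
def empirical_coverage(actual, lower, upper):
    """Fraction of held-out points that fall inside their interval."""
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return inside / len(actual)


# Made-up held-out values and interval bounds.
actual = [10, 12, 9, 15, 11]
lower = [8, 9, 10, 10, 9]
upper = [12, 11, 13, 14, 13]
coverage = empirical_coverage(actual, lower, upper)
```

If the intervals are nominally 80% intervals and the empirical
coverage on held-out data is far from 0.8, the uncertainty
estimates are miscalibrated in the sense described above.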
 
    | 3abiton wrote:
    | Why is Stan viewed negatively in this light? I am curious why
    | Bayesian libraries are the black sheep.
 
    | Donald wrote:
    | Do you have an example implementation of reimplementing the
    | core of these?
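
One possible shape for such a reimplementation, as a sketch
rather than anyone's actual code: a piecewise-linear trend built
from hinge features plus Fourier terms for one seasonal period,
fitted here with ordinary least squares via numpy. Plain least
squares stands in for the Stan, torch, or scipy.optimize fit
mentioned above; there is no changepoint regularization and no
uncertainty estimate. The function name, changepoint locations,
and synthetic data are all assumptions.

```python
import numpy as np


def design_matrix(t, changepoints, period, order):
    """Columns: intercept, slope, one hinge per changepoint, Fourier pairs."""
    cols = [np.ones_like(t), t]
    # Each hinge lets the slope change at its changepoint.
    cols += [np.maximum(0.0, t - c) for c in changepoints]
    for k in range(1, order + 1):
        angle = 2.0 * np.pi * k * t / period
        cols += [np.sin(angle), np.cos(angle)]
    return np.column_stack(cols)


# Synthetic daily data: the slope changes at day 60, plus a weekly cycle.
t = np.arange(120, dtype=float)
y = 0.5 * t - 0.3 * np.maximum(0.0, t - 60.0) + 2.0 * np.sin(2.0 * np.pi * t / 7.0)

changepoints = [30.0, 60.0, 90.0]
X = design_matrix(t, changepoints, period=7.0, order=3)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Forecast two weeks ahead by extending the same features.
t_future = np.arange(120, 134, dtype=float)
forecast = design_matrix(t_future, changepoints, period=7.0, order=3) @ coef
```

On real data the changepoint grid is usually dense, which is why
some penalty on the hinge coefficients is needed in practice.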
 
| rdli wrote:
| As others have pointed out, Prophet is not a particularly good
| model for forecasting, and has been superseded by a multitude of
| other models. If you want to do time series forecasting, I'd
| recommend using Darts: https://github.com/unit8co/darts. Darts
| implements a wide range of models and is fairly easy to use.
| 
| The problem with time series forecasting models in general is
| that they make a lot of assumptions about the shape of your
| data, and you'll find yourself spending a lot of time reshaping
| your data to fit. For example, they expect your data to arrive
| at a very regular interval. This is fine if it's, say, the data
| from a weather station. This doesn't work well in clinical
| settings (imagine a patient admitted into the ER -- there is a
| burst of data, followed by no data).
| 
| That said, there's some interesting stuff out there that I've
| been experimenting with that seems to be more tolerant of
| irregular time series and can be quite useful. If you're
| interested in exchanging ideas, drop me a line (email in my
| profile).
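
To make the regular-interval assumption concrete, here is a
minimal sketch of what forcing bursty data onto a fixed grid
looks like. The function name, the time units, and the sample
values are hypothetical; this is not how Darts or any specific
library resamples.

```python
def to_regular_grid(samples, start, step, n_bins):
    """Average (timestamp, value) samples per bin; empty bins become None."""
    bins = [[] for _ in range(n_bins)]
    for ts, value in samples:
        idx = int((ts - start) // step)
        if 0 <= idx < n_bins:
            bins[idx].append(value)
    return [sum(b) / len(b) if b else None for b in bins]


# A burst of readings at admission, then a single late reading.
samples = [(0, 98.0), (2, 100.0), (5, 102.0), (47, 97.0)]
grid = to_regular_grid(samples, start=0, step=10, n_bins=5)
```

The burst collapses into one averaged bin and most of the grid is
empty, so the model ends up depending on whatever imputation rule
fills the gaps.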
 
| nick0garvey wrote:
| Can someone explain why the "no free lunch theorem" does not
| cause problems here?
| 
| https://en.wikipedia.org/wiki/No_free_lunch_theorem
 
  | wgd wrote:
  | Disclaimer: I haven't looked at the linked library at all, but
  | this is a theoretical discussion which applies to any task of
  | signal prediction.
  | 
  | Out of all possible inputs, there are some that the model works
  | well on and others that it doesn't work well on. The trick is
  | devising an algorithm which works well on the inputs that it
  | will actually encounter in practice.
  | 
  | At the obvious extremes: this library can probably do a great
  | job at predicting linear growth, but there's no way it will
  | ever be better than chance at predicting the output of
  | /dev/random. And in fact, it probably does _worse_ than a
  | constant-zero predictor when applied to a random unbiased input
  | signal.
  | 
  | Except that it's also usually possible to detect such trivially
  | unpredictable signals (obvious way: run the prediction model on
  | all but the last N samples and see how it does at predicting
  | the final N), and fall back to a simpler predictor (like "the
  | next value is always zero" or "the next value is always the
  | same as the previous one") in such cases.
  | 
  | But that algorithm also fails on some class of inputs, like
  | "the signal is perfectly predictable before time T and then
  | becomes random noise". The core insight of the "No Free Lunch"
  | theorem is that when summed across _all possible_ input
  | sequences, no algorithm works any better than another, but the
  | crucial point is that you don't apply signal predictors to all
  | possible inputs.
  | 
  | Another place this pops up is in data compression. Many
  | (arguably all) compressors work by having a prediction or
  | probability distribution over possible next values, plus a
  | compact way of encoding which of those values was picked.
  | Proving that it's impossible to predict all possible input
  | signals correctly is equivalent to proving that it's impossible
  | to compress all possible inputs.
  | 
  | Another way of thinking about this: Imagine that you're the
  | prediction algorithm. You receive the previous N datapoints as
  | input and are asked for a probability distribution over
  | possible next values. In a theoretical sense every possible
  | value is equally likely, so you should output a uniform
  | distribution, but that provides no compression or useful
  | prediction. Your probabilities have to sum to 1, so the only
  | way you can increase the probability assigned to symbol A is to
  | decrease the weight of symbol B by an equal amount. If the next
  | symbol is A then congratulations, you've successfully done your
  | job! But if the next symbol was actually B then you have now
  | done worse (by any reasonable error metric) than the dumb
  | uniform distribution. If your performance is evaluated over all
  | possible inputs, the win and the loss balance out and you've
  | done exactly as well as the uniform prediction would have.
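
The balancing argument in the last paragraph can be checked
numerically. The sketch below assumes four equally likely symbols
and compares a uniform prediction with a confident one, averaging
over every possible next symbol. The names and probabilities are
illustrative. Two scores are shown because the "exactly as well"
conclusion depends on the metric: it holds for the probability
assigned to the true symbol, while under log loss the confident
prediction comes out strictly worse on average.

```python
import math


def average_scores(prediction):
    """Average two scores over every possible next symbol, each counted once."""
    k = len(prediction)
    avg_prob = sum(prediction.values()) / k
    avg_log_loss = sum(-math.log(p) for p in prediction.values()) / k
    return avg_prob, avg_log_loss


uniform = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
confident = {"A": 0.70, "B": 0.10, "C": 0.10, "D": 0.10}

u_prob, u_loss = average_scores(uniform)
c_prob, c_loss = average_scores(confident)
```

Either way, the confident predictor only pays off when the inputs
it actually sees favor A, which is the point made above about not
applying predictors to all possible inputs.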
 
  | tech_ken wrote:
  | Two explanations
  | 
  | First: Prophet is not actually "one model", it's closer to a
  | non-parametric approach than just a single model type. This
  | adds a lot of flexibility on the class of problems it can
  | handle. With that said, Prophet is "flexible" not "universal".
  | A time series of entirely random integers selected from
  | range(0,10) will be handled quite poorly, but fortunately
  | nobody cares about modeling this case.
  | 
  | Second: the same reason that only a small handful of possible
  | stats/ML models get used on virtually all problems. Most
  | problems which people solve with stats/ML share a number of
  | common features which makes it appropriate to use the same
  | model on them (the model's "assumptions"). Applications which
  | don't have these features get treated as edge-cases and
  | ignored, or you write a paper introducing a new type of model
  | to handle it. Consider any ARIMA-type time series model. These
  | are used all the time for many different problem spaces, and
  | are going to do reasonably well on "most" "common" stochastic
  | processes you encounter in "nature", because they're constructed to
  | resemble many types of natural processes. It's possible
  | (trivial, even) to conceive of a stochastic process which ARIMA
  | can't really handle (for instance, one whose non-stationarity
  | differencing can't remove), but
  | in practice most things that ARIMA utterly fails for are not
  | very interesting to model or we have models that work better
  | for that case.
 
| Tarq0n wrote:
| Prophet is such an appealing package because it promises to
| abstract away all the difficult parts of forecasting. However, in
| practice it does not fulfill its promises. I think this is a good
| discussion of the problems:
| https://www.microprediction.com/blog/prophet
 
| alexmolas wrote:
| I'm no time series expert, but from my experience and what I've
| heard, using Prophet for time series forecasting isn't
| recommended. It often leads to less-than-ideal results.
| 
| Curiously, in Medium-like (i.e., low-effort) publications it's still
| the recommended way to tackle a forecasting problem. The promise
| of a model that can solve any time series problem sounds great,
| but not all that glitters is gold, and as you get more experience
| you discover that solutions like this usually don't work.
 
  | wendyshu wrote:
  | Isn't recommended by whom?
 
    | alexmolas wrote:
    | Every time I, or someone at work with more experience than
    | me, have tried Prophet it has ended up in changing the
    | approach and trying a different technique. In my experience
    | with time series, hand-crafted recipes tend to work much
    | better than out-of-the-box solutions.
 
      | techwizrd wrote:
      | I agree completely. We always end up moving away from
      | Prophet every time. The results from Prophet are just not
      | very good, although it can be useful for a proof-of-
      | concept.
 
  | braza wrote:
  | I used Prophet and personally I do not have any problems, but I
  | agree with the criticism that the tool is so focused on
  | ergonomics that it abstracts away important aspects that could
  | be used to build better models [1].
  | 
  | [1] - https://ryxcommar.com/2021/11/06/zillow-prophet-time-series-...
 
    | elesiuta wrote:
    | I thought the biggest issue wasn't with the models
    | themselves, but how Zillow decided to apply and act on them,
    | which is why it didn't work in practice.
    | 
    | So on average their predictions may have been pretty good,
    | but since each transaction also depends on the other party to
    | accept their offer, and whether they get outbid, most of
    | their predictions where the offer actually goes through would
    | be on the tail end of where they slightly overestimated the
    | price.
    | 
    | This tweet from the article summed it up nicely
    | 
    | > Zillow made the same mistake that every new quant trader
    | makes early on: Mistaking an adversarial environment for a
    | random one.
    | https://twitter.com/0xdoug/status/1456032851477028870
    | 
    | I was lucky to make and learn from that mistake pretty
    | quickly with some algorithmic trading on much smaller
    | amounts. With housing transactions being much larger and
    | slower, you wouldn't learn this lesson until it was too late.
    | Models never perform as well in practice as they do in
    | theory, and you need to remember to account for both known
    | unknowns and unknown unknowns.
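
The selection effect described above can be reproduced with a toy
simulation. Everything here is an assumption made for
illustration: the price range, the noise levels, and the rule
that a seller accepts only when the offer meets a private
reservation price. It is not a model of Zillow's actual process.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(n_homes, noise_sd, seed):
    """Return (mean error over all offers, mean error over accepted offers)."""
    rng = random.Random(seed)
    all_errors, accepted_errors = [], []
    for _ in range(n_homes):
        true_value = rng.uniform(200_000, 600_000)
        # The buyer's estimate is unbiased: noise is centered on zero.
        offer = true_value + rng.gauss(0, noise_sd)
        # Hypothetical seller reservation price, also noisy around the truth.
        seller_floor = true_value + rng.gauss(0, noise_sd / 2)
        error = offer - true_value
        all_errors.append(error)
        if offer >= seller_floor:
            accepted_errors.append(error)
    return mean(all_errors), mean(accepted_errors)


overall_bias, accepted_bias = simulate(n_homes=20_000, noise_sd=20_000, seed=0)
```

Across all offers the average error stays near zero, but the
deals that actually close are skewed toward overpayment, because
sellers are more likely to accept the offers that came in high.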
 
  | pantsforbirds wrote:
  | I've honestly had consistently better results with standard
  | regression models. I really love the idea of it, and maybe I
  | need to be tuning it better somehow, but overall I haven't had
  | a great experience.
 
| simonhughes22 wrote:
| Wondering how many people are now downloading this and other libs
| like Darts and trying to do stock market prediction or crypto
| price forecasting. Most of the devs I know, myself included, have
| dabbled in coding up trading algorithms at some point in time.
 
  | beckingz wrote:
  | It's the classic data nerd trap.
  | 
  | "I'm pretty good at statistics and can predict things using
  | software... I bet I could make money in the stock market"
  | 
  | And then they realize just how hard it is.
 
    | SpaceManNabs wrote:
    | the hard part isn't the stats. it is all the information that
    | people buy and setting up those ingest pipelines! If i had a
    | satellite telling me when a certain big company has a lot of
    | cars in the lot parked after hours, I could make a zillion
    | bucks too!
 
| dang wrote:
| Related. Others?
| 
|  _Zillow, Prophet, time series, and prices_ -
| https://news.ycombinator.com/item?id=29137200 - Nov 2021 (143
| comments)
| 
|  _Is Facebook's "Prophet" the time-series Messiah or just a
| naughty boy?_ - https://news.ycombinator.com/item?id=27695574 -
| July 2021 (78 comments)
 
| whymauri wrote:
| Also relevant: https://news.ycombinator.com/item?id=27695574
 
  | Terretta wrote:
  | This is the HN comment thread on a well-written skeptical
  | article with this zinger:
  | 
  |  _"You can imagine my disappointment when, out-of-the-box,
  | Prophet was beaten soundly by a 'take the last value'
  | forecast."_
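
A minimal sketch of the check implied by that quote: hold out the
last few points and score any candidate against a "take the last
value" baseline. The function names, the simple trend
extrapolator standing in for a fancier model, and the two toy
series are all hypothetical.

```python
def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def last_value_forecast(history, horizon):
    """Naive baseline: repeat the most recent observation."""
    return [history[-1]] * horizon


def linear_trend_forecast(history, horizon):
    """Extrapolate the average step between the first and last observation."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


def holdout_scores(series, horizon, models):
    """Fit on all but the last `horizon` points, score on those points."""
    history, actual = series[:-horizon], series[-horizon:]
    return {name: mean_abs_error(actual, model(history, horizon))
            for name, model in models.items()}


models = {"last_value": last_value_forecast,
          "linear_trend": linear_trend_forecast}

steady_growth = [2.0 * i for i in range(30)]
saturating = [0.0, 3.0, 5.0, 6.0, 8.0, 9.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0]

scores_growth = holdout_scores(steady_growth, horizon=5, models=models)
scores_saturating = holdout_scores(saturating, horizon=4, models=models)
```

The trend model wins on the steadily growing series and loses to
the naive baseline on the saturating one, so a baseline
comparison like this is cheap insurance before trusting any
automated forecaster.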
 
___________________________________________________________________
(page generated 2023-09-26 23:00 UTC)