|
| citilife wrote:
| My team and I wrote an NLP application that detects sensitive
| data and detects/validates schemas, etc., as well as covering
| the other items provided by pandas-profiling.
|
| https://github.com/capitalone/DataProfiler
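|
| (For anyone curious, a minimal usage sketch based on the README
| -- the exact API may have shifted since, so treat it as
| illustrative:)
|
|     import json
|     import dataprofiler as dp
|
|     # Data() auto-detects the file type (csv, json, parquet, text, ...)
|     data = dp.Data("your_dataset.csv")
|
|     # Profiler computes the statistics and runs the
|     # sensitive-data (entity) detection model
|     profile = dp.Profiler(data)
|
|     report = profile.report(report_options={"output_format": "pretty"})
|     print(json.dumps(report, indent=4))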
|
| That being said, we noted the same thing. It shouldn't matter
| which modeling framework you use; the data pipelining is where
| 99% of the work typically is. Modeling itself always has the
| same basic interface -- it takes a matrix of data as input and
| outputs a matrix of data.
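|
| To make the "matrix in, matrix out" point concrete, a rough
| sketch (toy data, shapes invented for illustration):
|
|     import numpy as np
|     import torch
|     from sklearn.linear_model import LogisticRegression
|
|     X = np.random.rand(100, 8)           # 100 samples, 8 features
|     y = (X.sum(axis=1) > 4).astype(int)  # toy binary labels
|
|     # sklearn: matrix in, matrix of class probabilities out
|     probs_sk = LogisticRegression().fit(X, y).predict_proba(X)
|
|     # PyTorch: different library, same basic contract
|     model = torch.nn.Sequential(torch.nn.Linear(8, 2),
|                                 torch.nn.Softmax(dim=1))
|     probs_pt = model(torch.tensor(X, dtype=torch.float32)).detach().numpy()
|
|     assert probs_sk.shape == probs_pt.shape == (100, 2)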
|
| Some libraries are good at specific components. Others offer
| speedups, etc. But it's all so new that it's effectively going
| to change month-to-month. So I always tell the team to build
| what you can, as fast as you can, with the tools you have. We
| can always update it later, once the pipeline is in place.
|
| MontyCarloHall wrote:
| I generally agree with the point made in this article, although
| I'll point out that it's only been true for the last couple of
| years. Until TensorFlow completely revamped its syntax in v2.0,
| scrapping the previous graph-based approach for PyTorch-like eager
| execution, writing code in TF was much more time-consuming than
| in PyTorch, since you had to define the entire computational
| graph before you could execute it as a single unit. This made
| iterative debugging extremely painful, since you couldn't
| interactively execute individual steps within the graph.
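|
| For anyone who never suffered through it, TF 1.x looked roughly
| like this (from memory, so a sketch rather than gospel):
|
|     import tensorflow as tf  # needs 1.x; in 2.x these live in tf.compat.v1
|
|     # Define the whole graph first; nothing computes yet, so you
|     # can't just print() an intermediate value to debug it.
|     x = tf.placeholder(tf.float32, shape=[None, 3])
|     y = tf.reduce_sum(x * 2.0, axis=1)
|
|     # ...then execute the graph as a single unit inside a session.
|     with tf.Session() as sess:
|         print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))  # [12.]
|
| versus the eager style, where every op runs immediately and
| every intermediate is a concrete value you can inspect:
|
|     import torch
|
|     x = torch.tensor([[1.0, 2.0, 3.0]])
|     y = (x * 2.0).sum(dim=1)
|     print(y)  # tensor([12.])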
|
| These days, thankfully, the choice of framework comes down mostly
| to (a) minor syntactic preferences and (b) specific functionality
| available in one framework but not another. For example, although
| I generally prefer PyTorch's syntax since it's closer to numpy's,
| TF (via TensorFlow Probability) supports far more probability
| distributions (and operations on those distributions) than
| PyTorch. When working on a model in
| PyTorch, if I discover that I need that additional functionality,
| it's easy enough to convert all my code to TF.
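|
| e.g., last time I checked, the von Mises-Fisher distribution was
| available in TensorFlow Probability, while torch.distributions
| only had the 1-D von Mises -- something like:
|
|     import tensorflow_probability as tfp
|
|     tfd = tfp.distributions
|
|     # Directional distribution on the unit sphere; no counterpart
|     # in torch.distributions as of PyTorch 1.9.
|     vmf = tfd.VonMisesFisher(mean_direction=[0.0, 0.0, 1.0],
|                              concentration=5.0)
|     samples = vmf.sample(10)      # 10 points on the unit 2-sphere
|     logp = vmf.log_prob(samples)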